Data Mining Bias The Fool’s Gold of Objective Technical Analysis Copyright © 2005 David R. Aronson


The Problem: Out-of-Sample Performance Deterioration

Definitions

In-sample data: used for rule back-testing
Observed performance: rule’s performance in-sample
Out-of-sample data: not used for back-testing
Expected performance: rule’s performance over all possible futures


Out-of-Sample Performance Deterioration

[Figure: cumulative gains ($) vs. time. In-sample: observed performance, 50% ROI. Out-of-sample: expected performance, 5% ROI.]


Objective vs. Subjective TA

Objective TA: definitive signal algorithm; back-testing possible
Subjective TA: interpretation required; back-testing not possible


Objective Technical Analysis

Goal: develop mechanical rules that will be profitable in the future
Research method: back-testing
Observable: back-tested (observed) performance statistic, e.g. ROI, profit factor, Sharpe ratio, ROI / Ulcer Index
Inference: a conclusion about the rule’s future performance, given its observed performance


Two Modes of Back-Testing: Two Roles for Observed Performance

1. Single-rule back-testing: observed performance is used to estimate the rule’s future rate of return
2. Data mining: many rules are back-tested and the rule with the best performance is selected; observed performance is used as a selection criterion


Both Roles Are Legitimate

The data miner’s mistake: using the observed performance of the highest-performing rule to estimate its future performance


Observed Performance Is Positively Biased

Observed performance > expected performance
Therefore: future ROI < observed ROI

Single-Rule Back-Testing Is Not Data Mining

[Figure: flowchart. Rule idea, program rule, back-test, performance statistic, then stop research and start trading. Observed ROI = 25% is an unbiased estimate of future rule performance: expected ROI = 25%.]


Projected Out-of-Sample Performance for Single-Rule Back-Test

[Figure: cumulative gains ($) vs. time. Expected out-of-sample ROI equals the back-tested (in-sample) ROI of the single rule.]


The Data Mining Loop

[Figure: flowchart. Rule idea, program rule, back-test, refined rule (alter or add parameters), then back to program rule. ROI labels on the loop: 5% and 500%. After 3,423 rules are tested, a rule with 500% ROI is discovered.]

The observed performance of the best rule out of 3,423 is positively biased. The rule’s expected performance is less than 500%. MUCH LESS!


Expected Out-of-Sample Performance: Data-Mined Rule

[Figure: cumulative gains ($) vs. time. In-sample: observed performance of the best rule found by data mining. Out-of-sample: the false expectation continues at the observed rate, the true expected ROI lies below it, and the gap between them is the data mining bias.]


Data Mining Bias: Definition

DMB = (Observed ROI of best rule) - (Expected ROI of best rule)

Observed ROI: sample statistic; value is known
Expected ROI: population parameter; value is unknown

Two Components of Observed Performance

Observed performance = performance due to the rule’s inherent merit (predictive power) +/- random effects (luck, good or bad)

Inherent merit: will persist through time*
Random effects: will vary unpredictably through time

Relative Size of Components: Rule Merit vs. Randomness

Observed ROI = expected performance (predictive power) +/- random effects


Probability Density of ROI for a Rule With Expected ROI = 12%

[Figure: probability density over rate of return, on an axis from -72% to +72%. Marked values: -24%, 0%, +48%. The most likely observed ROI equals the expected ROI, 12%.]


Imagine 12 Different Rules, All With Expected ROI = 10%

[Figure: 12 rules, each with an expected ROI of 10%. The luckiest shows an observed ROI of 70% (the "data miner’s delight"). The gap between the two is the DMB.]

Back-Testing More Rules: Higher Probability of an Even Luckier Observed Performance

[Figure: a larger set of rules, each with an expected ROI of 10%. The luckiest now shows an observed ROI of 100% (the "data miner’s delight").]

What Is Data Mining?

General definition: search for predictive patterns in data
TA context: search for rules or models whose signals will generate good performance in the FUTURE

Data Mining Method

Human-guided rule search
Machine-guided rule search:
  Decision trees
  Neural networks
  Genetic algorithms
  Regression splines
  Support vector machines
  Kernel regression

Data Mining Is a Multiple Comparison Procedure (MCP)

A. Problem to solve: buy / sell timing
B. Numerous candidate solutions: TA rules
C. A figure of merit (FOM): ROI
D. Each candidate’s performance scored by FOM: back-testing many rules
E. Candidate with highest figure of merit is selected: rule with highest ROI chosen

MCP Applied to Finding a Violinist for the New York Philharmonic Orchestra

A. Problem to solve: find best violinist
B. A set of candidate solutions: 100 musicians
C. A figure of merit: performance on sight-reading test of a complex piece
D. Each candidate’s performance scored by figure of merit: judges’ rating
E. Candidate with highest figure of merit is selected: candidate scored best by judges gets the job

Spectrum of Randomness: Role of Randomness (Luck) in Observed Performance

[Figure: a spectrum running from non-random, merit-based performance to random, luck-based performance. Items placed along it: music competition, competition proving math theorems, monkeys on typewriters as authors, lottery players, TA rule data mining.]

Relative Contributions of Randomness and Merit Vary Across Data Mining Ventures

[Figure: an observed ROI of 25% shown for several data mining ventures, each split into a MERIT component and a RANDOM component of different relative sizes. The random component is labeled DMB.]

The Data Mining Bias Depends on Four Features of the Mining Venture

Four features of a data mining venture that affect the size of the bias (DMB size is a function of):

1. Number of rules back-tested
2. Number of observations used to compute the observed performance statistic
3. Degree of correlation among rule returns (rule similarity)
4. Variation in true merit (expected performance) among tested rules

1. Number of Rules Back-Tested (Size of Rule Universe Searched)

[Figure: DM bias (observed - expected) vs. number of rules back-tested, from 1 to many. Performance statistic: ROI.]

2. Number of Observations Used to Compute ROI

[Figure: DM bias (observed - expected) vs. number of observations, from 1 to many.]

3. Correlation in Rule Returns (Rule Similarity)

[Figure: horizontal axis is the correlation coefficient, with ticks at 0.2, 0.4, 0.6, 0.8, 1.0.]


4. Variance in Expected ROI (True Merit) Among Rules Tested

[Figure: two cases. Little variance: rules of similar merit. High variance: some superstar rules.]


Worst Case for the Data Miner

1. Large rule universe
2. Few observations to compute observed performance
3. Highly independent rules
4. Similar true merit
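The third item (rule independence) can be checked with a small simulation. This is an illustrative sketch, not part of the original experiments. It assumes every rule has zero expected ROI, that observed ROIs are normally distributed with standard deviation `sigma`, and that every pair of rules shares the same correlation `rho`. The function name and all parameters are hypothetical.

```python
import math
import random
import statistics


def best_rule_bias(n_rules, rho, sigma=10.0, n_trials=5000, seed=0):
    """Average observed ROI (%) of the best of n_rules zero-merit rules.

    Each rule's observed ROI is sigma * (sqrt(rho) * common + sqrt(1 - rho) * own),
    which gives every pair of rules correlation rho (an illustrative assumption).
    Because true merit is zero, the average best observed ROI is the bias.
    """
    rng = random.Random(seed)
    w_common, w_own = math.sqrt(rho), math.sqrt(1.0 - rho)
    best = []
    for _ in range(n_trials):
        common = rng.gauss(0.0, 1.0)  # luck shared by all rules in this trial
        best.append(max(
            sigma * (w_common * common + w_own * rng.gauss(0.0, 1.0))
            for _ in range(n_rules)
        ))
    return statistics.fmean(best)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        bias = best_rule_bias(50, rho)
        print(f"rho = {rho:.1f}: bias of best of 50 rules = {bias:.1f}%")
```

Under these assumptions the bias shrinks as `rho` rises, because highly similar rules amount to fewer independent chances to get lucky.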

Experimental Investigation of Data Mining Bias

With artificial trading rules (ATRs) with known expected performance

Bootstrap Generation of ATR Performance Histories

[Figure: absolute monthly returns of the S&P 500, 1928-2003, are sampled with replacement. A random device, with the probability of a profitable month set experimentally, decides whether each sampled month is a gain or a loss. The result is an ATR performance history ($) of M months.]

Expected Return of an Artificial Trading Rule

p: probability of a winning month
r: average absolute monthly S&P 500 return (1)

ER = (p x r) - ((1 - p) x r)

(1) S&P 500 average absolute monthly return, 8/28 – 4/03 = 3.97%
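As a quick check of the formula, the sketch below evaluates it for win rates of 50% and 63%. The function name is hypothetical, and the annualization (monthly return times 12, no compounding) is an assumption.

```python
def atr_expected_return(p: float, r: float) -> float:
    """Expected monthly return (%) of an artificial trading rule.

    p: probability of a winning month
    r: average absolute monthly market return, in percent
    """
    return p * r - (1.0 - p) * r


R_SP500 = 3.97  # average absolute monthly S&P 500 return (%), from the slide

if __name__ == "__main__":
    for p in (0.50, 0.63):
        monthly = atr_expected_return(p, R_SP500)
        annual = 12 * monthly  # simple annualization, an assumption
        print(f"p = {p:.2f}: monthly ER = {monthly:.2f}%, annualized = {annual:.1f}%")
```

With p = 0.63 the annualized value comes out at about 12.4%, and p = 0.50 gives zero.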

ATR Expected Return as a Function of % of Profitable Signals

[Figure: cumulative gains ($) for ATRs with different win rates.]

Profitable signals = 50%: expected return = 0%
Profitable signals = 63%: expected return = +12.4%

Five ATR Performance Histories: All Rules Have Expected Return = 0

[Figure: five cumulative-gain histories with observed ROIs of +3, -1, -3, +2 and +9, on a scale from -10% to +10%.]

Do This Experiment 10,000 Times, Selecting the Best-Performing ATR out of 5

[Figure: each dot is the observed 24-month ROI of the best rule out of 5. Expected ROI: 0%.]

The observed ROI of the luckiest rule out of 5 rules with expected ROI = 0% is +12%.
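The experiment can be imitated in a few lines. This is a minimal sketch under stated assumptions, not the original code: the list of absolute monthly returns is an invented placeholder (its mean is close to 3.97%), whereas the original samples actual S&P 500 months. All function names are hypothetical.

```python
import random
import statistics

# Placeholder absolute monthly returns (%), invented for illustration; their
# mean is close to the 3.97% quoted in the slides. The original experiment
# samples actual S&P 500 monthly returns instead.
ABS_MONTHLY_RETURNS = [0.4, 1.1, 1.9, 2.8, 3.7, 4.9, 6.6, 10.4]


def atr_annualized_roi(p_win, n_months, rng):
    """Annualized ROI (%) of one artificial trading rule (ATR) history."""
    total = 0.0
    for _ in range(n_months):
        size = rng.choice(ABS_MONTHLY_RETURNS)        # sampled with replacement
        sign = 1.0 if rng.random() < p_win else -1.0  # the "random device"
        total += sign * size
    return 12.0 * total / n_months  # simple, non-compounded annualization


def mean_best_roi(n_rules=5, n_months=24, n_trials=10_000, p_win=0.5, seed=0):
    """Average observed ROI of the best of n_rules ATRs over n_trials repetitions."""
    rng = random.Random(seed)
    return statistics.fmean(
        max(atr_annualized_roi(p_win, n_months, rng) for _ in range(n_rules))
        for _ in range(n_trials)
    )


if __name__ == "__main__":
    # With p_win = 0.5 every rule has expected ROI = 0, so this average is the bias.
    print(f"average ROI of best of 5: {mean_best_roi():+.1f}%")
```

The exact figure depends on the placeholder returns, so it will not match the slide precisely. The point is that the average comes out clearly positive even though no rule has any merit.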

Experiment 1: Data Mining a Universe of Rules With Equal Expected ROI

All ATRs have expected ROI = 0%

[Figure: relative frequency of the ROI of the best-performing rule in the universe, on an axis from -70% to +70%. Number of rules tested: 2. Number of months: 24. Expected return of best rule: 0%. Observed return of best rule: 8.5%. Data mining bias (observed - expected) = +8.5%.]

[Figure: relative frequency of the ROI of the best-performing rule, axis from -70% to +70%. Expected return of best rule: 0%. Observed return of best rule: 22%. Data mining bias (observed - expected) = +22%.]

[Figure: relative frequency of the ROI of the best-performing rule, axis from -70% to +70%. Expected return of best rule: 0%. Observed return of best rule: 33%.]

[Figure: relative frequency of the ROI of the best-performing rule, axis from -70% to +70%. Expected return of best rule: 0%. Observed return of best rule: 48%.]

Data Mining Bias as a Function of Number of Rules Tested

[Figure: DM bias in annualized ROI (axis from 10% to 60%) vs. universe size, i.e. number of rules (log scale, ticks at 1, 2, 10, 50, 400, 1000). All rules have expected ROI = 0%.]

Data Mining Bias as a Function of Number of Rules Tested, at 4 Levels of Number of Months Used to Compute Observed ROI

[Figure: DM bias (axis from 12 to 96) vs. number of rules (ticks at 1, 4, 16, 64, 256, 1024), with one curve each for 10, 24, 100 and 1000 months.]
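A table of the same shape can be produced by simulation. The sketch below is a simplified stand-in for the experiment: it assumes every month is a gain or loss of exactly 3.97% with equal probability, rather than a return sampled from S&P 500 history. All names are hypothetical.

```python
import random
import statistics

R_ABS = 3.97  # average absolute monthly return (%), from the slides


def zero_merit_roi(n_months, rng):
    """Annualized ROI (%) of a rule that wins each month with probability 0.5."""
    # Counting the 1-bits of n_months random bits draws the number of winning
    # months from Binomial(n_months, 0.5) without an explicit loop.
    wins = bin(rng.getrandbits(n_months)).count("1")
    return 12.0 * R_ABS * (2 * wins - n_months) / n_months


def data_mining_bias(n_rules, n_months, n_trials=500, seed=0):
    """Average observed ROI of the best of n_rules zero-merit rules."""
    rng = random.Random(seed)
    return statistics.fmean(
        max(zero_merit_roi(n_months, rng) for _ in range(n_rules))
        for _ in range(n_trials)
    )


if __name__ == "__main__":
    rule_counts = (1, 4, 16, 64, 256)
    print("months" + "".join(f"{n:>8d}" for n in rule_counts))
    for n_months in (10, 24, 100, 1000):
        row = "".join(f"{data_mining_bias(n, n_months):8.1f}" for n in rule_counts)
        print(f"{n_months:>6d}{row}")
```

Reading across a row shows the bias growing with the number of rules. Reading down a column shows it shrinking as more months are used.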

DM Bias vs. Number of Observations

[Figure: DM bias (axis from 12 to 96) vs. number of observations (months) used to compute the ROI of the ATR (ticks at 1, 200, 400, 600, 800, 1000), for the rule with the best observed performance in a universe of size 2.]

Experiment 2: Data Mining a Universe of Rules of Unequal Merit (Variation in Expected ROI)

Assumed Probability Distribution of Rule Merit

[Figure: probability vs. expected performance (ROI), from 0 to high ROI.]

Approximately 1 rule in 10,000 has expected ROI > 20% per year
Universe average expected ROI = 1.4% per year

Observed ROI vs. Expected ROI for Best Rule out of 10

[Figure: annual ROI (axis from 6 to 36) vs. number of months used to compute ROI (1 to 1000), with curves for observed ROI and expected ROI. Data mining bias: 4%.]

Observed ROI vs. Expected ROI for Best Rule out of 500

[Figure: annual ROI (axis from 12 to 60) vs. number of months used to compute ROI (1 to 1000), with curves for observed ROI and expected ROI. Data mining bias: 11%.]

Data Mining Works!

Given a sufficient number of observations, back-testing more rules increases the chance of finding rules of higher true merit (expected ROI)

Higher observed in-sample performance predicts higher out-of-sample performance
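A rough simulation illustrates both statements. The assumptions here are loud ones: true merit is drawn from an exponential distribution with mean 1.4% per year (chosen only to match the stated universe average, not the actual distribution in the slides), and luck is normal noise whose standard deviation `noise_sd` stands in for the number of observations (more observations, smaller noise). All names are hypothetical.

```python
import random
import statistics


def pick_best(n_rules, noise_sd, rng, mean_merit=1.4):
    """Return (observed ROI, true expected ROI) of the best-looking rule."""
    rules = []
    for _ in range(n_rules):
        merit = rng.expovariate(1.0 / mean_merit)    # assumed merit distribution
        observed = merit + rng.gauss(0.0, noise_sd)  # merit plus luck
        rules.append((observed, merit))
    return max(rules)  # tuples compare on observed ROI first


def average_pick(n_rules, noise_sd, n_trials=2000, seed=0):
    """Average (observed ROI, expected ROI) of the selected rule."""
    rng = random.Random(seed)
    picks = [pick_best(n_rules, noise_sd, rng) for _ in range(n_trials)]
    observed = statistics.fmean(o for o, _ in picks)
    expected = statistics.fmean(m for _, m in picks)
    return observed, expected


if __name__ == "__main__":
    for n_rules in (2, 10, 100):
        observed, expected = average_pick(n_rules, noise_sd=2.0)
        print(f"{n_rules:>4d} rules: observed {observed:5.1f}%, expected {expected:5.1f}%")
```

With modest noise, the expected ROI of the selected rule rises with the number of rules searched, while the observed ROI stays above it by the data mining bias.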

Expected ROI of Best-Performing Rule as a Function of Amount of Data Mining (Universe Size)

[Figure: expected ROI (ticks at 2.4%, 4.8%, 7.2%, 9.6%) vs. universe size, i.e. number of rules back-tested (0 to 250), with curves for n = 2, n = 100 and n = 1000. Average expected ROI in the universe: 1.4%/yr.]

Three Solutions

1. Out-of-sample testing (walk forward)
   Positive: gives unbiased estimates of expected return
   Negative: not all of the data can be used for analysis
2. Randomization methods
   White’s Reality Check: bootstrapping
   Masters & Aronson Monte Carlo permutation method
3. Markowitz / Xu data mining bias correction factor
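The first solution can be illustrated with a simulation. This sketch uses a single in-sample / out-of-sample split rather than a full walk-forward scheme, and it assumes zero-merit rules with normally distributed monthly returns. The function name and all parameters are hypothetical.

```python
import random
import statistics


def in_vs_out_of_sample(n_rules=20, n_in=24, n_out=24, n_trials=500,
                        monthly_sd=5.0, seed=0):
    """Average annualized ROI (%) of the in-sample winner, in and out of sample."""
    rng = random.Random(seed)
    in_rois, out_rois = [], []
    for _ in range(n_trials):
        candidates = []
        for _ in range(n_rules):
            # Zero-merit rule: monthly returns are pure noise (an assumption).
            returns = [rng.gauss(0.0, monthly_sd) for _ in range(n_in + n_out)]
            roi_in = 12.0 * statistics.fmean(returns[:n_in])
            roi_out = 12.0 * statistics.fmean(returns[n_in:])
            candidates.append((roi_in, roi_out))
        # Tuples compare on in-sample ROI first, so selection ignores roi_out.
        best_in, best_out = max(candidates)
        in_rois.append(best_in)
        out_rois.append(best_out)
    return statistics.fmean(in_rois), statistics.fmean(out_rois)


if __name__ == "__main__":
    roi_in, roi_out = in_vs_out_of_sample()
    print(f"best rule in-sample:   {roi_in:+.1f}%")
    print(f"same rule out-of-sample: {roi_out:+.1f}%")
```

The selected rule looks strong in-sample purely through selection. On data that played no part in the selection, its average ROI falls back toward the true value of zero.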

Randomization Methods

Generate the sampling distribution of the test statistic under the null hypothesis
Test statistic: best observed ROI in a universe of useless rules
The observed performance of the best rule is compared to this distribution and a p-value is computed
P-value: the probability of observing an ROI this high or higher, given that the NULL HYPOTHESIS IS TRUE
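The general idea can be sketched as follows. This is a simplified i.i.d. bootstrap written for illustration. It is not White’s Reality Check or the Masters & Aronson permutation method, both of which differ in important details. It assumes `returns` holds one list of per-period returns for each rule tested, all of equal length. The function name is hypothetical.

```python
import random
import statistics


def best_rule_p_value(returns, n_resamples=1000, seed=0):
    """P-value of the best rule's mean return, adjusted for the search.

    returns: list of per-rule return series, all the same length.
    """
    rng = random.Random(seed)
    n_periods = len(returns[0])
    means = [statistics.fmean(series) for series in returns]
    observed_best = max(means)
    # Impose the null hypothesis: shift every rule to zero mean return.
    centered = [[x - m for x in series] for series, m in zip(returns, means)]
    count = 0
    for _ in range(n_resamples):
        # The same resampled periods are used for all rules, which keeps
        # the correlation between rules intact.
        idx = [rng.randrange(n_periods) for _ in range(n_periods)]
        null_best = max(sum(series[i] for i in idx) / n_periods
                        for series in centered)
        if null_best >= observed_best:
            count += 1
    return count / n_resamples


if __name__ == "__main__":
    rng = random.Random(42)
    # 20 useless rules, 24 monthly returns each (pure noise).
    noise_rules = [[rng.gauss(0.0, 5.0) for _ in range(24)] for _ in range(20)]
    print(f"adjusted p-value of the best rule: {best_rule_p_value(noise_rules):.2f}")
```

Because each resample takes the maximum across all rules, the null distribution reflects how good the best of many useless rules tends to look, which is the comparison the slide describes.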

Requirements for Randomization Methods

Data for all rules tested during data mining:
Bootstrapping: every daily or weekly return for each rule tested
Monte Carlo: every daily or weekly rule position (long, short or neutral) for each rule tested, and the daily or weekly return of the market traded

Traditional Significance Test for Observed ROI for Best Rule of 50 Back-Tested

[Figure: relative frequency distribution under the null hypothesis, centered at 0%, on an axis from -70% to +70%. Observed performance of best rule of 50: 23%. P < .05: "Wow", significant result.]

Significance Test Corrected for Data Mining Bias Induced by Back-Testing 50 Rules

[Figure: relative frequency distribution under the null hypothesis, centered at 33%, on an axis from -70% to +70%. Observed performance of best rule of 50: 23%. P-value > 0.65: not significant.]

Recommendations

1. Data mine aggressively: testing many rules increases the chance of finding a good one.
2. Use as many observations as possible.
3. Save data on all rules tested.
4. Test the best rule’s significance, adjusted for the degree of data mining that led to its discovery.
