Data Mining Bias: The Fool’s Gold of Objective Technical Analysis
Copyright © 2005 David R. Aronson
The Problem: Out-of-Sample Performance Deterioration
Definitions
In-sample data: used for rule back-testing
Observed performance: rule’s performance in-sample
Out-of-sample data: not used for back-testing
Expected performance: rule’s performance over all possible futures
Out-of-Sample Performance Deterioration
[Figure: cumulative gains ($) vs. time, split into in-sample and out-of-sample periods. Observed performance (in-sample): 50% ROI. Expected performance (out-of-sample): 5% ROI.]
Objective vs. Subjective TA
Objective: definitive signal algorithm; back-testing possible
Subjective: interpretation required; back-testing not possible
Objective Technical Analysis
Goal: develop mechanical rules that will be profitable in the future
Research method: back-testing
Observable: back-tested (observed) performance statistic, such as ROI, Profit Factor, Sharpe Ratio, or ROI / Ulcer Index
Inference: a conclusion about the rule’s future performance given its observed performance
Two Modes of Back-Testing: Two Roles for Observed Performance
1. Single-rule back-testing
• Observed performance is used to estimate the rule’s future rate of return
2. Data mining: many rules are back-tested and the rule with the best performance is selected
• Observed performance is used as a selection criterion
Both Roles Are Legitimate
The Data Miner’s Mistake: using the observed performance of the highest-performing rule to estimate its future performance
Observed Performance Is Positively Biased
Observed performance > expected performance
Therefore: future ROI < observed ROI
Single-Rule Back-Testing Is Not Data Mining
[Diagram: Rule Idea -> Program Rule -> Back Test -> Performance Statistic -> Stop Research, Start Trading]
Observed ROI = 25% is an unbiased estimate of future rule performance: expected ROI = 25%
Projected Out-of-Sample Performance for a Single-Rule Back-Test
[Figure: cumulative gains ($) vs. time. The back-tested (in-sample) ROI of the single rule continues out-of-sample: expected out-of-sample ROI equals in-sample ROI.]
The Data Mining Loop
[Diagram: Rule Idea -> Program Rule -> Back Test -> Refined Rule (alter or add parameters) -> loop repeats. ROI labels in the diagram: 5% and 500%.]
After 3,423 rules are tested, a rule with 500% ROI is discovered.
The observed performance of the best rule out of 3,423 is positively biased.
The rule’s expected performance is less than 500%. MUCH LESS!
Expected Out-of-Sample Performance of a Data-Mined Rule
[Figure: cumulative gains ($) vs. time. The observed performance of the best rule found by data mining projects a false expectation; the true expected ROI is lower. The gap between the two is the data mining bias.]
Data Mining Bias: Definition
DMB = (Observed ROI of best rule) - (Expected ROI of best rule)
Two Components of Observed Performance
Observed performance (sample statistic; value is known) =
performance due to the rule’s inherent merit, its predictive power (population parameter; value is unknown; will persist through time*)
+/- random effects, i.e. luck, good or bad (will vary unpredictably through time)
Relative Size of Components: Rule Merit vs. Randomness
Observed ROI = expected performance (predictive power) +/- random effects
Probability Density of ROI for a Rule With Expected ROI = 12%
[Figure: probability density vs. rate of return, with axis ticks from -72% to +72% and marks at -24% and +48%. The density is centered on the expected ROI of 12%, which is also the most likely observed ROI.]
Imagine 12 Different Rules, All With Expected ROI = 10%
[Figure: twelve rules, each with expected ROI of 10%. The luckiest one shows an observed ROI of 70%: the "Data Miner’s Delight". The gap between 70% and 10% is the DMB.]
Back-Testing More Rules: Higher Probability of an Even Luckier Observed Performance
[Figure: a larger set of rules, each with expected ROI of 10%. The luckiest one now shows an observed ROI of 100%: the "Data Miner’s Delight".]
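The effect on the two slides above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: luck is modeled as normally distributed noise, and the noise level (`noise_sd`) and the function name `mean_best_observed` are illustrative choices.

```python
import random
import statistics

def mean_best_observed(n_rules, expected_roi=10.0, noise_sd=20.0,
                       n_trials=2000, seed=0):
    """Average observed ROI (%) of the luckiest rule among n_rules.

    Every rule has the same expected ROI. Observed ROI is modeled as
    expected ROI plus normally distributed luck; noise_sd is an
    illustrative assumption, not a value from the slides.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(n_trials):
        observed = [rng.gauss(expected_roi, noise_sd) for _ in range(n_rules)]
        best.append(max(observed))  # the data miner keeps only the best rule
    return statistics.mean(best)

if __name__ == "__main__":
    for n in (1, 12, 30):
        # The gap between the printed value and 10.0 is the data mining bias.
        print(n, round(mean_best_observed(n), 1))
```

With a single rule the average observed ROI stays near the expected 10%; as more equally mediocre rules are tested, the best observed ROI drifts upward even though no rule has improved.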
What Is Data Mining?
General definition: search for predictive patterns in data
TA context: search for rules or models whose signals will generate good performance in the FUTURE
Data Mining Methods
Human-guided rule search
Machine-guided rule search:
Decision trees
Neural networks
Genetic algorithms
Regression splines
Support vector machines
Kernel regression
Data Mining Is a Multiple Comparison Procedure (MCP)
A. Problem to solve: buy / sell timing
B. Numerous candidate solutions: TA rules
C. A figure of merit (FOM): ROI
D. Each candidate’s performance scored by FOM: back-testing many rules
E. Candidate with highest figure of merit is selected: rule with highest ROI chosen
MCP Applied to Finding a Violinist for the New York Philharmonic Orchestra
A. Problem to solve: find the best violinist
B. A set of candidate solutions: 100 musicians
C. A figure of merit: performance on a sight-reading test of a complex piece
D. Each candidate’s performance scored by figure of merit: judges’ rating
E. Candidate with highest figure of merit is selected: the candidate scored best by the judges gets the job
Spectrum of Randomness
Role of Randomness (Luck) in Observed Performance
[Figure: a spectrum running from Random (luck-based performance) to Non-Random (merit-based performance). Activities placed along it: music competition, competition proving math theorems, monkeys on typewriters as authors, lottery players, TA rule data mining.]
Relative Contributions of Randomness and Merit Vary Across Data Mining Ventures
[Figure: three data mining ventures (DM Venture 1, 2, 3), each with observed ROI = 25%, split into different proportions of MERIT and RANDOM. The random portion is the DMB.]
The Data Mining Bias Depends on Four Features of the Mining Venture
The size of the DMB is a function of:
1. Number of rules back-tested
2. Number of observations used to compute the observed performance statistic
3. Degree of correlation among rule returns (rule similarity)
4. Variation in true merit (expected performance) among tested rules
1. Number of Rules Back-Tested (Size of Rule Universe Searched)
[Figure: DM bias (observed - expected) vs. number of rules back-tested, from 1 to many. Performance statistic: ROI.]
2. Number of Observations Used to Compute ROI
[Figure: DM bias (observed - expected) vs. number of observations, from 1 to many.]
3. Correlation in Rule Returns (Rule Similarity)
[Figure: DM bias (observed - expected) vs. correlation coefficient, from 0.2 to 1.0.]
4. Variance in Expected ROI (True Merit) Among Rules Tested
[Figure: DM bias (observed - expected) vs. variance in expected ROI, from little variance (all rules of similar merit) to high variance (some super star rules).]
Worst Case for the Data Miner
1. Large rule universe.
2. Few observations to compute observed performance.
3. Highly independent rules.
4. Similar true merit.
Experimental Investigation of Data Mining Bias
With Artificial Trading Rules (ATR) With Known Expected Performance
Bootstrap Generation of ATR Performance Histories
[Diagram: absolute monthly returns of the S&P 500, 1928-2003, are sampled with replacement. A random device, with the probability of a profitable month set experimentally, is applied to the sampled returns. The output is an ATR performance history ($) of M months.]
Expected Return of an Artificial Trading Rule
p: probability of a winning month
r: average absolute monthly S&P 500 return
ER = (p x r) - ((1 - p) x r)
S&P 500 absolute monthly return, 8/28 to 4/03: 3.97%
ATR Expected Return as a Function of % of Profitable Signals
Profitable signals = 50%: expected return = 0%
Profitable signals = 63%: expected return = +12.4%
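The two expected-return figures above can be checked with a few lines of arithmetic. This is a minimal sketch: the function name is made up for illustration, and the simple multiplication by 12 to annualize is an assumption (the slides do not state the annualization method), chosen because it reproduces the quoted +12.4%.

```python
def atr_expected_return(p, r):
    """Expected monthly return: gain r with probability p, lose r otherwise."""
    return p * r - (1.0 - p) * r

R_ABS_MONTHLY = 3.97  # percent; S&P 500 absolute monthly return from the slides

for p in (0.50, 0.63):
    monthly = atr_expected_return(p, R_ABS_MONTHLY)
    annual = 12 * monthly  # simple, non-compounded annualization (assumption)
    print(f"p = {p:.2f}: {monthly:+.2f}% per month, {annual:+.1f}% per year")
# p = 0.50: +0.00% per month, +0.0% per year
# p = 0.63: +1.03% per month, +12.4% per year
```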
Five ATR Performance Histories: All Rules Have Expected Return = 0
[Figure: cumulative gains ($) vs. time for five ATRs, with an ROI scale from -10% to +10%. Observed ROIs: +3, -1, -3, +2, +9.]
Do This Experiment 10,000 Times, Selecting the Best-Performing ATR Out of 5
[Figure: each dot is the observed 24-month ROI of the best rule out of 5. Expected ROI: 0%.]
The luckiest rule out of 5 rules with expected ROI = 0% shows an observed ROI of +12%.
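The experiment described above can be sketched as follows. The S&P 500 return series is not reproduced in the slides, so `PLACEHOLDER_ABS_RETURNS` is made-up data and the printed numbers will not match the slide's +12%. The function names and the simple annualization (mean monthly return times 12) are likewise assumptions for illustration.

```python
import random
import statistics

def atr_history(abs_returns, p_win, n_months, rng):
    """One artificial trading rule history: signed monthly returns (%)."""
    history = []
    for _ in range(n_months):
        size = rng.choice(abs_returns)  # sample an absolute return with replacement
        sign = 1.0 if rng.random() < p_win else -1.0  # the "random device"
        history.append(sign * size)
    return history

def annualized_roi(history):
    """Simple annualization: mean monthly return times 12 (an assumption)."""
    return 12.0 * statistics.mean(history)

def mean_best_roi(abs_returns, n_rules, n_months=24, p_win=0.5,
                  n_trials=1000, seed=0):
    """Average observed ROI of the best of n_rules ATRs over many trials."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_trials):
        rois = [annualized_roi(atr_history(abs_returns, p_win, n_months, rng))
                for _ in range(n_rules)]
        best.append(max(rois))
    return statistics.mean(best)

# Placeholder absolute monthly returns (%), NOT actual S&P 500 data.
PLACEHOLDER_ABS_RETURNS = [0.5, 1.2, 2.0, 2.8, 3.5, 4.1, 5.0, 6.3, 8.0, 11.0]

if __name__ == "__main__":
    for n in (1, 5, 50):
        # p_win = 0.5, so every rule has expected ROI = 0%;
        # any positive average here is pure data mining bias.
        print(n, round(mean_best_roi(PLACEHOLDER_ABS_RETURNS, n), 1))
```

Feeding in the real absolute monthly returns and raising `n_trials` to 10,000 would bring the sketch closer to the procedure on the slide.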
Experiment 1: Data Mining a Universe of Rules With Equal Expected ROI
All ATRs have expected ROI = 0%
ROI of Best-Performing Rule in Universe
[Figures: four histograms of relative frequency vs. ROI of the best-performing rule, on an axis from -70% to +70%. Number of months: 24. Results by number of rules tested:]
Number of rules tested   Expected return, best rule   Observed return, best rule   Data mining bias (observed - expected)
2                        0%                           8.5%                         +8.5%
10                       0%                           22%                          +22%
50                       0%                           33%                          +33%
400                      0%                           48%                          +48%
Data Mining Bias as a Function of Number of Rules Tested
In annualized ROI; all rules have expected ROI = 0%
[Figure: DM bias (10% to 60%) vs. number of rules in universe (log scale; ticks at 1, 2, 10, 50, 400, 1000).]
Data Mining Bias as a Function of Number of Rules Tested, at 4 Levels of Number of Months Used to Compute Observed ROI
[Figure: DM bias (ticks from 12 to 96) vs. universe size, number of rules (log scale; ticks at 1, 4, 16, 64, 256, 1024). One curve each for 10, 24, 100 and 1000 months.]
DM Bias vs. Number of Observations
[Figure: DM bias (ticks from 12 to 96) vs. number of observations (months) used to compute the ROI of the ATR (ticks from 1 to 1000). One curve each for the rule with the best observed performance in a universe of size 2, 10 and 100.]
Experiment 2: Data Mining a Universe of Rules of Unequal Merit
(Variation in Expected ROI)
Assumed Probability Distribution of Rule Merit
[Figure: probability vs. expected performance (ROI), starting at 0 with a long tail toward high ROI.]
Approximately 1 rule in 10,000 has expected ROI > 20% per year
Universe average expected ROI = 1.4% per year
Observed ROI vs. Expected ROI for Best Rule Out of 10
[Figure: annual ROI (ticks from 6 to 36) vs. number of months used to compute ROI (ticks from 1 to 1000), with one curve for observed ROI and one for expected ROI. Data mining bias marked: 4%.]
Observed ROI vs. Expected ROI for Best Rule Out of 500
[Figure: annual ROI (ticks from 12 to 60) vs. number of months used to compute ROI (ticks from 1 to 1000), with one curve for observed ROI and one for expected ROI. Data mining bias marked: 11%.]
Data Mining Works!
Given a sufficient number of observations, back-testing more rules increases the chance of finding rules of higher true merit (expected ROI).
Higher observed in-sample performance predicts higher out-of-sample performance.
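The claim above can be explored with a rough simulation. Everything numeric here is an assumption made for illustration: the slides do not give the form of the merit distribution, so an exponential distribution with the stated 1.4% mean stands in for it, luck is modeled as normal noise, and the function name is hypothetical.

```python
import random
import statistics

def expected_roi_of_selected(n_rules, n_months, n_trials=500, seed=0,
                             mean_merit=1.4, monthly_noise_sd=5.0):
    """Average TRUE merit (%/yr) of the rule picked for best OBSERVED ROI.

    Assumptions (not from the slides): true merit is exponentially
    distributed with mean mean_merit; luck is normal noise whose effect
    on annualized ROI shrinks with the square root of n_months.
    """
    rng = random.Random(seed)
    luck_sd = 12.0 * monthly_noise_sd / n_months ** 0.5
    selected = []
    for _ in range(n_trials):
        merits = [rng.expovariate(1.0 / mean_merit) for _ in range(n_rules)]
        observed = [m + rng.gauss(0.0, luck_sd) for m in merits]
        # Select on observed performance, then record the true merit.
        best = max(range(n_rules), key=observed.__getitem__)
        selected.append(merits[best])
    return statistics.mean(selected)

if __name__ == "__main__":
    for n_months in (2, 100, 1000):
        row = [round(expected_roi_of_selected(n, n_months), 1)
               for n in (2, 50, 250)]
        print(n_months, row)
```

Under these assumptions, searching a larger universe raises the true merit of the selected rule when many months of data are used, and helps far less when the observed ROI rests on only a few months.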
Expected ROI of Best-Performing Rule as a Function of Amount of Data Mining (Universe Size)
[Figure: expected ROI (ticks at 2.4%, 4.8%, 7.2%, 9.6%) vs. universe size, number of rules back-tested (ticks from 0 to 250). Curves labeled n = 2, n = 100 and n = 1000. Reference level: average expected ROI in universe, 1.4%/yr.]
Three Solutions
1. Out-of-sample testing (walk forward).
Positive: gives unbiased estimates of expected return.
Negative: not all data can be used for analysis.
2. Randomization methods
White’s Reality Check: bootstrapping
Masters & Aronson Monte Carlo permutation method
3. Markowitz / Xu data mining bias correction factor.
Randomization Methods
Generate the sampling distribution of the test statistic under the null hypothesis.
Test statistic: best observed ROI in a universe of useless rules.
The observed performance of the best rule is compared to this distribution and a p-value is computed.
P-value: the probability of observing an ROI this high or higher given that the NULL HYPOTHESIS IS TRUE.
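The steps above can be sketched in code. This is a simplified illustration in the spirit of the bootstrap approach, not an implementation of White's Reality Check itself (which uses a stationary bootstrap) and not the Monte Carlo permutation method. The function name and data layout are assumptions.

```python
import random
import statistics

def bootstrap_p_value(returns_by_rule, n_resamples=1000, seed=0):
    """P-value for the best rule's mean return, adjusted for data mining.

    returns_by_rule: list of equal-length lists, one return history per
    rule tested (ALL rules tested, not only the winner).
    """
    rng = random.Random(seed)
    n_periods = len(returns_by_rule[0])
    observed_best = max(statistics.mean(r) for r in returns_by_rule)

    # Impose the null hypothesis: shift every rule to zero mean return.
    centered = []
    for r in returns_by_rule:
        m = statistics.mean(r)
        centered.append([x - m for x in r])

    count = 0
    for _ in range(n_resamples):
        # Resample the same periods for every rule, preserving the
        # correlation among rule returns.
        idx = [rng.randrange(n_periods) for _ in range(n_periods)]
        null_best = max(sum(r[i] for i in idx) / n_periods for r in centered)
        if null_best >= observed_best:
            count += 1
    # Fraction of useless-rule universes whose best rule did at least as well.
    return count / n_resamples
```

The key design point is that each resample takes the maximum across all rules, so the null distribution is that of the best of N useless rules rather than of a single rule.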
Requirements for Randomization Methods
Data for all rules tested during data mining:
Bootstrapping: every daily or weekly return for each rule tested.
Monte Carlo: every daily or weekly rule position (long, short or neutral) for each rule tested, and the daily or weekly return of the market traded.
Traditional Significance Test for Observed ROI of Best Rule of 50 Back-Tested
[Figure: relative frequency vs. ROI, on an axis from -70% to +70%. Null hypothesis distribution centered at 0%. Observed performance of best rule of 50: 23%. P < .05: "Wow", significant result.]
Significance Test Corrected for Data Mining Bias Induced by Back-Testing 50 Rules
[Figure: relative frequency vs. ROI, on an axis from -70% to +70%. Null hypothesis distribution centered at 33%. Observed performance of best rule of 50: 23%. P-value > 0.65: not significant.]
Recommendations
1. Data mine aggressively: testing many rules increases the chance of finding a good one.
2. Use as many observations as possible.
3. Save data on all rules tested.
4. Test the best rule’s significance, adjusted for the degree of data mining that led to its discovery.