Effects of Cross-Correlation between Ensemble Members on Forecasting Accuracy

Kim, Young-Oh 1) / Seo, Young-Ho 2) / Park, Dong Kwan 3)

1) Professor, Department of Civil & Environmental Engineering, Seoul National University, Seoul, Korea
2) Master Student, Department of Civil & Environmental Engineering, Seoul National University, Seoul, Korea
3) Master Student, Department of Civil & Environmental Engineering, Seoul National University, Seoul, Korea

1 Introduction

Background

ESP (Ensemble Streamflow Prediction) is a numerical prediction method that generates a sample set of possible future states for streamflow prediction and is used to analyze uncertainties.

Since the 1970s, ESP has been used effectively to deal with the uncertainties of hydrologic forecasts, and it remains an active research area for both short- and long-range predictions.

‘Many ESP are designed to comprise of equally likely (equiprobable) ensemble members, and to have an adequate number of ensemble members in order to describe the full range of input probabilities’ (Cloke and Pappenberger, 2009).

However, when there is correlation between scenarios, the structure of ESP can be distorted. In addition, the effective number of ensemble members, i.e., the minimum number that maintains an acceptable accuracy level of ESP in operational hydrology, needs to be studied.

Objectives

Analyze the effects of cross-correlation between ensemble members, as well as of the number of ensemble members, on the accuracy of ESP.

Determine the effective number of ensemble scenarios.

2 Methodology

Overview

Generation of Correlated Ensemble Scenarios

① Ensemble data matrix, X

② Cholesky decomposition

③ Correlated ensemble matrix, Y

④ Generating observation according to ‘Nominal’ accuracy
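The generation steps listed above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes standard-normal ensemble data, a correlation matrix R with the same coefficient rho between every pair of members, and hypothetical function names. Step ④ (generating observations for a given nominal accuracy) is not specified in enough detail on the poster and is left out.

```python
import math
import random


def equicorrelation_matrix(p, rho):
    """Correlation matrix R: 1 on the diagonal, rho everywhere else (assumed form)."""
    return [[1.0 if i == j else rho for j in range(p)] for i in range(p)]


def cholesky(R):
    """Return the lower-triangular p x p factor F such that R = F F^T."""
    p = len(R)
    F = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(F[i][k] * F[j][k] for k in range(j))
            if i == j:
                F[i][j] = math.sqrt(R[i][i] - s)
            else:
                F[i][j] = (R[i][j] - s) / F[j][j]
    return F


def correlated_ensemble(n, p, rho, seed=0):
    """Build Y (n rows, p members): each row is F applied to an uncorrelated row of X."""
    rng = random.Random(seed)
    F = cholesky(equicorrelation_matrix(p, rho))
    Y = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]  # one row of X
        Y.append([sum(F[i][k] * x[k] for k in range(i + 1)) for i in range(p)])
    return Y


def sample_correlation(Y, a, b):
    """Pearson correlation between ensemble members a and b."""
    n = len(Y)
    ma = sum(row[a] for row in Y) / n
    mb = sum(row[b] for row in Y) / n
    cov = sum((row[a] - ma) * (row[b] - mb) for row in Y)
    va = sum((row[a] - ma) ** 2 for row in Y)
    vb = sum((row[b] - mb) ** 2 for row in Y)
    return cov / math.sqrt(va * vb)


Y = correlated_ensemble(n=20000, p=5, rho=0.5)
print(round(sample_correlation(Y, 0, 4), 2))
```

Because each row of X has identity covariance, the transformed rows have covariance F F^T = R, so the sample correlation between any two members should be close to the chosen rho. In practice a library routine such as `numpy.linalg.cholesky` would replace the hand-written factorization.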


Figure 1 Behavior of the Brier score of the generated ensemble forecasts as a function of the ensemble cross-correlation and the number of ensemble members: (a) for the nominal accuracy = 0.1; (b) 0.3; (c) 0.5; (d) 0.7; (e) 0.9; and (f) integrated results


Evaluation & Analysis

Brier score (originally introduced by Brier)

33.3% 33.3% 33.3%

0 ≤ BS ≤ 2
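The poster's formula for the Brier score did not survive extraction. As a hedged sketch, the bound 0 ≤ BS ≤ 2 matches Brier's original multi-category form, in which the squared differences between forecast probability and observed outcome (1 for the category that occurred, 0 otherwise) are summed over the categories and averaged over the forecasts. The function names and the two thresholds below are hypothetical; the three-category split is an assumption based on the 33.3% labels.

```python
def category_probabilities(members, lower, upper):
    """Fraction of ensemble members falling below, between, and above two thresholds."""
    counts = [0, 0, 0]
    for value in members:
        if value < lower:
            counts[0] += 1
        elif value <= upper:
            counts[1] += 1
        else:
            counts[2] += 1
    return [c / len(members) for c in counts]


def brier_score(forecast_probs, observed_category):
    """Multi-category Brier score of a single forecast (0 is perfect, 2 is worst)."""
    return sum(
        (prob - (1.0 if k == observed_category else 0.0)) ** 2
        for k, prob in enumerate(forecast_probs)
    )


def mean_brier_score(forecasts, observed_categories):
    """Average Brier score over a set of forecast/observation pairs."""
    scores = [brier_score(f, o) for f, o in zip(forecasts, observed_categories)]
    return sum(scores) / len(scores)


print(brier_score([1.0, 0.0, 0.0], 0))  # 0.0, all probability on the observed category
print(brier_score([0.0, 1.0, 0.0], 0))  # 2.0, all probability on a wrong category
```

Under this form, a forecast that always assigns one third to each of three categories scores 2/3, whichever category is observed.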

Estimate the effective number of scenarios

This study tried to identify the number of ensemble members needed to effectively improve the ESP, using the Brier score.

① The effective number of scenarios, which should lie between 3 and 100, is the number of scenarios at which the Brier score has dropped by 90% of its range from the top of the Brier score curve.

② Identify the slope of the Brier score curve over each interval.

[Overview flowchart]

Controlled variables:

Number of ensemble members, p: 3, 5, 7, 9, 12, 15, 20, 30, 50, 100

Cross-correlation coefficient: 0, 0.1, 0.3, 0.5, 0.7, 0.9

Controlled accuracy (Nominal Accuracy): 0.1, 0.3, 0.5, 0.7, 0.9

Generation:

Ensemble data matrix, X

Correlation matrix, R

Cholesky decomposition

p x p Factorized matrix, F

Correlated ensemble matrix, Y

Generating observation

Evaluation:

Effects of cross-correlation on accuracy

Effects of correlation on accuracy according to “Nominal accuracy”

Estimate the effective number of scenarios

Example of estimating the effective number of scenarios

[Figure: Mean Brier score versus Number of Ensemble Members, with the annotations below]

max. BS = 0.9

min. BS = 0.66

(max. BS - min. BS) x 90% = (0.9 - 0.66) x 90%

max. slope = 0.05

max. slope x 5% = 0.0025

p_f1 = p_f2

① The effective number can be defined as the closest natural number at which the BS has dropped from the maximum BS (at p = 3) by 90% of the difference between the maximum and the minimum BS (at p = 100). This measure is denoted as p_f1 (= 16).

② The slope (i.e., the marginal improvement in BS divided by the increase in p) can also be used to define the effective number. The maximum slope occurs in the interval between 3 and 5, and thus this study defines the alternative effective number (p_f2) as the larger number of the interval where the slope becomes 5% of the maximum slope.
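The two definitions above can be illustrated with a short sketch. It is not the authors' implementation, and the function names are hypothetical. The first function only computes the Brier score level that definition ① targets, using the example values max. BS = 0.9 and min. BS = 0.66. The second applies definition ② to the slopes reported in Table 1 for nominal accuracy 0.1 and correlation 0, choosing the interval whose slope is closest to 5% of the maximum slope, as the table footnote describes.

```python
# Intervals of the number of ensemble members used in Table 1.
INTERVALS = [(3, 5), (5, 7), (7, 9), (9, 12), (12, 15),
             (15, 20), (20, 30), (30, 50), (50, 100)]


def bs_level_after_drop(max_bs, min_bs, fraction=0.9):
    """Brier score reached after dropping `fraction` of the range (max_bs - min_bs)."""
    return max_bs - fraction * (max_bs - min_bs)


def effective_number_by_slope(slopes, intervals=INTERVALS, fraction=0.05):
    """Upper end of the interval whose slope is closest to fraction * max slope."""
    target = fraction * max(slopes)
    closest = min(range(len(slopes)), key=lambda i: abs(slopes[i] - target))
    return intervals[closest][1]


# Definition 1: target level for the example curve, 0.9 - 0.9 * (0.9 - 0.66).
level = bs_level_after_drop(0.9, 0.66)
print(round(level, 3))  # 0.684

# Definition 2: Table 1 row for nominal accuracy 0.1 and correlation 0.
table1_row = [0.0411, 0.0175, 0.0099, 0.0057, 0.0035,
              0.0020, 0.0010, 0.0004, 0.0001]
print(effective_number_by_slope(table1_row))  # 20
```

For this row, 5% of the maximum slope is about 0.0021, matching the "5% Slope" column of Table 1, and the closest interval is 15~20, so the sketch returns 20.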

3 Results

[Figure 1, panels (a) to (f): Mean Brier score versus Number of Ensemble Members (3 to 100) for Nominal Accuracy = 0.1, 0.3, 0.5, 0.7, 0.9 and Integrated Results. Curves: Uncorrelated and Correlation 0.1, 0.3, 0.5, 0.7, 0.9. The panels for Nominal Accuracy = 0.1, 0.3, 0.5, and 0.7 also mark the 90% Improvement.]

(*In Table 1, bold indicates the interval whose slope is closest to the 5% value of the max slope, i.e., of the slope of 3~5.)

This study was motivated by the hypothesis that more ensemble members may be required when the members are cross-correlated, because cross-correlation generally implies a loss of information. A number of synthetic ensembles were generated for various combinations of ensemble cross-correlation, number of ensemble members, and forecasting accuracy level.

When the forecasts are inaccurate, the accuracy of ESP improves as the ensemble cross-correlation decreases or as the number of ensemble members increases (Figure 1(a), (b), (c)).

Contrary to the first conclusion, when the forecasts are very accurate, the accuracy of ESP improves as the ensemble cross-correlation increases. In particular, when the ensemble cross-correlation is low, the accuracy of ESP deteriorates as the number of ensemble members increases (Figure 1(e)).

There is a certain accuracy range (nominal accuracy around 0.7) in which the ensemble cross-correlation does not affect the forecasting accuracy (Figure 1(d)).

Each Brier score curve was observed to decrease exponentially and is hypothesized to converge, so an effective number of scenarios can be determined. This study found that 20 ~ 25 members can be recommended regardless of the ensemble cross-correlation.

Accuracy  Correlation  Interval of the number of ensemble members                                      5% Slope
                       3~5     5~7     7~9     9~12    12~15   15~20   20~30   30~50   50~100
0.1       0            0.0411  0.0175  0.0099  0.0057  0.0035  0.0020  0.0010  0.0004  0.0001  0.0021
          0.1          0.0395  0.0168  0.0091  0.0054  0.0035  0.0019  0.0010  0.0004  0.0001  0.0020
          0.3          0.0363  0.0147  0.0088  0.0054  0.0029  0.0017  0.0010  0.0004  0.0001  0.0018
          0.5          0.0326  0.0131  0.0082  0.0043  0.0023  0.0018  0.0007  0.0004  0.0001  0.0016
          0.7          0.0264  0.0106  0.0065  0.0038  0.0022  0.0012  0.0008  0.0003  0.0001  0.0013
          0.9          0.0164  0.0066  0.0046  0.0020  0.0018  0.0005  0.0004  0.0002  0.0000  0.0008
0.3       0            0.0331  0.0142  0.0079  0.0045  0.0026  0.0017  0.0008  0.0003  0.0001  0.0017
          0.1          0.0318  0.0140  0.0079  0.0043  0.0028  0.0015  0.0008  0.0003  0.0001  0.0016
          0.3          0.0300  0.0127  0.0076  0.0041  0.0025  0.0015  0.0008  0.0003  0.0001  0.0015
          0.5          0.0276  0.0116  0.0067  0.0039  0.0024  0.0014  0.0007  0.0003  0.0001  0.0014
          0.7          0.0239  0.0101  0.0057  0.0032  0.0023  0.0013  0.0005  0.0002  0.0000  0.0012
          0.9          0.0167  0.0060  0.0038  0.0023  0.0008  0.0014  0.0003  0.0002  0.0001  0.0008
0.5       0            0.0236  0.0098  0.0056  0.0034  0.0019  0.0012  0.0006  0.0002  0.0001  0.0012
          0.1          0.0232  0.0098  0.0060  0.0032  0.0020  0.0011  0.0006  0.0002  0.0001  0.0012
          0.3          0.0234  0.0097  0.0058  0.0031  0.0020  0.0010  0.0005  0.0003  0.0001  0.0012
          0.5          0.0221  0.0099  0.0049  0.0035  0.0016  0.0010  0.0007  0.0002  0.0001  0.0011
          0.7          0.0204  0.0090  0.0045  0.0032  0.0017  0.0010  0.0005  0.0002  0.0001  0.0010
          0.9          0.0149  0.0059  0.0038  0.0019  0.0009  0.0007  0.0004  0.0002  0.0001  0.0007
0.7       0            0.0112  0.0050  0.0026  0.0014  0.0010  0.0005  0.0003  0.0001  0.0000  0.0006
          0.1          0.0118  0.0051  0.0028  0.0018  0.0007  0.0007  0.0003  0.0001  0.0000  0.0006
          0.3          0.0133  0.0053  0.0037  0.0014  0.0010  0.0007  0.0003  0.0002  0.0000  0.0007
          0.5          0.0143  0.0063  0.0029  0.0020  0.0012  0.0007  0.0004  0.0001  0.0000  0.0007
          0.7          0.0143  0.0068  0.0029  0.0027  0.0009  0.0007  0.0004  0.0002  0.0000  0.0007
          0.9          0.0123  0.0052  0.0026  0.0018  0.0009  0.0006  0.0003  0.0001  0.0001  0.0006

Table 1 Slope of the Brier score between each interval

Seoul National University, Hydrology Research Group, http://hrg.snu.ac.kr
