D. Asber2007

8/22/2019 D. Asber2007


Non-parametric short-term load forecasting

D. Asber a, S. Lefebvre a,*, J. Asber b, M. Saad b, C. Desbiens c

a IREQ, Institut de Recherche d'Hydro-Quebec, 1800 Boul. Lionel Boulet, Varennes, Que., Canada J3X 1S1

b Ecole de Technologie Superieure, 1100 Notre-Dame Ouest, Montreal, Que., Canada H3C1K3

c Hydro-Quebec Distribution 680 Sherbrooke Ouest, Montreal, Que., Canada H3C4T8

Received 22 March 2005; received in revised form 30 May 2006; accepted 5 September 2006

Abstract

Load forecasting is an important problem in the operation and planning of electrical power generation, as well as in transmission and distribution networks. This paper addresses short-term load forecasting. It deals with the development of a reliable and efficient kernel regression model to forecast the load in the Hydro-Quebec distribution network.

A historical data set comprising weather information and load consumption is used. A non-parametric model serves to establish a relationship among past, current and future temperatures and the system loads. The paper proposes a class of flexible conditional probability models and techniques for classification and regression problems. A group of regression models is used, each one focusing on consumer classes characterising specific load behaviour. Each forecasting process uses the information of the past 300 h and yields estimated loads for the next 120 h. Numerical investigations show that the suggested technique is an efficient way of computing forecast statistics.

2007 Elsevier Ltd. All rights reserved.

Keywords: Distribution network; Forecasting; Time series; Regression; Non-parametric

1. Introduction

In constructing a load forecasting model, a mathematical relation is established between the measured load and various factors of influence. The model contains several coefficients, with values to be determined, that quantify the magnitude of each influence. The coefficient values are chosen such that the overall error between model estimates and actual measured loads is minimized. The model is considered valid if tests conducted with numerous historical data sets result in small overall errors. Improvements to a deficient model could involve the use of a different mathematical relationship or of data that are more refined.

Irrespective of the forecasting technique, weather parameters are the key factors of Hydro-Quebec short-term load forecasts: temperature and humidity are the most commonly used load predictors. Thermostat-based models have thus been developed (explanatory models). The time factors include the day of the week and the hour of the day, because there are important differences in load between weekdays and weekends. Furthermore, the load on different weekdays can behave differently. For example, Mondays and Fridays, being adjacent to weekends, often have structurally different loads than Tuesday through Thursday. This is particularly true during summer. Holidays are more difficult to forecast than non-holidays because of their relatively infrequent occurrence. For customer classes with similar load patterns, standard load curves, namely diagrams of loads as a function of time, can be obtained through load research studies based on modelling individual customer demands within a specific interval.

For short-term load forecasting (STLF) several factors must be considered, such as time factors, weather data, and possible customer classes. Several forecasting methods have been developed using parametric regression or time series [1,2]. These technologies, despite some limitations, are widely used in the industry. Neural networks [3-6] and fuzzy techniques [7] have also been applied, and there

0142-0615/$ - see front matter 2007 Elsevier Ltd. All rights reserved.

doi:10.1016/j.ijepes.2006.09.007

* Corresponding author. E-mail address: [email protected] (S. Lefebvre).

www.elsevier.com/locate/ijepes

Electrical Power and Energy Systems 29 (2007) 630-635


are numerous publications in scientific journals. According to Hippert et al. [8], although these technologies seem to provide valid load forecasts, most investigators have used seemingly misspecified models that have been incompletely tested. More research on the behavior of large neural networks is needed before definite conclusions are drawn on the suitability of these approaches. To alleviate these problems, a heuristic approach in [9] has been rejuvenated in [10], which proposes using abductive networks. The technique is reported to offer the advantages of simplified and more automated model synthesis. It also provides analytical input-output models that automatically select influential inputs. Nonetheless, tuning is still required to improve forecasting accuracy through, for example, the inclusion of hourly temperature data and the development of dedicated seasonal models. Charytoniuk et al. [11] present another approach to short-term load forecasting, namely non-parametric regression. The main advantage of the non-parametric approach is that it is data driven and eliminates the need for a statistical analysis aimed at selecting a multivariate distribution fitting the data. This also assures portability of the proposed method: it can be used in any utility, regardless of the type of its load distribution.

This paper builds on non-parametric regression and compares its outcome with the classical approaches. In Section 2, the paper first presents a simple time series based forecasting model and a thermostat-based regression model. In Section 3, non-parametric algorithms are used for STLF for commercial loads. In Section 4, the probability forecast functions of the aggregated load are developed. In Section 5, results from different models are presented for residential and commercial loads. For residential loads, the paper compares a thermostat-based model and a non-parametric model. For commercial loads, time series estimates and non-parametric estimates are compared, since thermostat-based regression is not representative of these loads.

2. Time series and regression models

2.1. Basic time series model

Time series models use previous (historical) values of the data as input to the model, thus

$$ y(t) \approx F\{x(t),\, x(t - dt),\, \ldots,\, x(t - n\,dt),\, y(t - dt),\, y(t - 2dt),\, \ldots,\, y(t - n\,dt)\} \tag{1} $$

where y(t), y(t - dt) and y(t - 2dt) represent the load at t, t - dt and t - 2dt, dt being the time period considered. Here dt is 1 h, and x(t) represents a vector of other factors such as the day, the hour or the temperature.
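As an illustration of how the inputs of Eq. (1) can be assembled from hourly records, the following minimal Python sketch builds one input row per forecast hour. The function name `build_lagged_samples`, the use of a single scalar exogenous factor (temperature) and the sample values are assumptions made for this example; they are not taken from the paper.

```python
def build_lagged_samples(load, exog, n_lags):
    """Build (inputs, targets) pairs matching the structure of Eq. (1).

    Each input row is [x(t), ..., x(t - n dt), y(t - dt), ..., y(t - n dt)]
    and the target is y(t). Here exog holds one scalar factor per hour
    (an assumption; the paper allows a vector of factors).
    """
    if len(load) != len(exog):
        raise ValueError("load and exog must have the same length")
    inputs, targets = [], []
    for t in range(n_lags, len(load)):
        x_part = [exog[t - k] for k in range(0, n_lags + 1)]  # x(t) ... x(t - n dt)
        y_part = [load[t - k] for k in range(1, n_lags + 1)]  # y(t - dt) ... y(t - n dt)
        inputs.append(x_part + y_part)
        targets.append(load[t])
    return inputs, targets


# Hypothetical hourly data: load in p.u. and outside temperature in deg C.
load = [0.50, 0.52, 0.55, 0.60, 0.58]
temp = [-5.0, -6.0, -8.0, -9.0, -7.0]
inputs, targets = build_lagged_samples(load, temp, n_lags=2)
```

Any regression function F can then be fitted to these pairs; the choice of F is left open here.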

2.2. A thermostat-based regression model for heating loads

In this model, the system load closely follows home heating requirements as controlled by thermostat actions. The random heating cycles of numerous homes are not represented, thus the model cannot reproduce the combined load over time. However, with a single equation, it can reproduce the average load over the considered time period.

The dynamic model for the temperature of an average house having a heater regulated by a thermostat is of the form

$$ \frac{dx}{dt} = Ax + BM + CQ, \qquad T = Hx \tag{2} $$

where T is the room temperature as measured by the thermostat; x is the thermostat state; M is a vector of possibly non-linear functions of meteorological inputs such as outside air temperature, solar intensity and wind velocity; Q is the heat flow of space heaters; A, B, C, H are constant matrices whose values depend on house construction.

The thermostat reduces the thermal characteristics (and the effects of weather and lifestyle) to two variables: on-duration and off-duration. The on-duration d1 is the time in minutes required for the room temperature T to rise from the lower set point (Ts - D) to the upper set point Ts when the heater is on. The off-duration do is the time in minutes required for T to decrease from Ts to (Ts - D) when the heater is off. The mean value of the thermostat status under normal conditions is

$$ b = \frac{d_1}{d_1 + d_o} \tag{3} $$

In this model, we substitute the value of b in the matrix H instead of using the temperature forecast directly.
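A small numerical sketch of Eq. (3) may help fix ideas. The function name `thermostat_duty_cycle` and the durations used below are hypothetical; the paper does not give numerical values for the on- and off-durations.

```python
def thermostat_duty_cycle(on_minutes, off_minutes):
    """Mean thermostat status b = d1 / (d1 + do), as in Eq. (3)."""
    if on_minutes < 0 or off_minutes < 0:
        raise ValueError("durations must be non-negative")
    if on_minutes + off_minutes == 0:
        raise ValueError("at least one duration must be positive")
    return on_minutes / (on_minutes + off_minutes)


# Hypothetical cycle: heater on for 10 min, then off for 30 min.
b = thermostat_duty_cycle(10.0, 30.0)
```

With these assumed durations the heater is on for a quarter of each cycle, so b = 0.25.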

3. Non-parametric models

In load forecasting, optimal algorithms often require knowledge of the underlying densities of measurements and/or noise. As these densities are usually unknown, assumptions are frequently made that compromise the algorithms' performance. A common approach to this problem is data density estimation. If a particular density form is assumed or known, then parametric estimation is used. If nothing is assumed about the density shape, non-parametric estimation is the choice.

The kernel density estimator, also commonly referred to as the Parzen window estimator [12], is non-parametric. It is particularly attractive when no a priori information is available to guide the choice of density with which to fit the data, for example the number of variables affecting a forecast. A comprehensive review of non-parametric estimation is presented in [13]. In time series, the Parzen window is a weighted moving average transformation used to smooth measurements.

3.1. Parzen window

In the Parzen window approach, a hypercube cell of fixed width is used to investigate a region R_n. The region volume is


$$ V_n = h_n^{d} \tag{4} $$

where h_n is the length of the edge of R_n. Define, for example, the function

$$ \varphi(u) = \begin{cases} 1, & |u_j| \le \tfrac{1}{2}, \quad j = 1, \ldots, d \\ 0, & \text{otherwise} \end{cases} \tag{5} $$

φ((x - x_i)/h_n) is equal to unity if x_i falls within the hypercube of volume V_n centered at x and is equal to zero otherwise. With n samples (independent observations) in total, the number of samples K_n falling in this hypercube, and the resulting density estimate, are

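$$ K_n = \sum_{i=1}^{n} \varphi\!\left(\frac{x - x_i}{h_n}\right), \qquad \frac{K_n}{n V_n} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{V_n}\, \varphi\!\left(\frac{x - x_i}{h_n}\right) \tag{6} $$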

The probability estimates P_n(x) are

$$ P_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_n}\, \varphi\!\left(\frac{x - x_i}{h_n}\right) \tag{7} $$

The definition of φ may vary (uniform, triangular, Gaussian and others) since the choice in Eq. (5) is not unique. This function must satisfy two conditions, namely $\int_{R} \varphi(u)\,du = 1$ and φ(.) is symmetric. By considering a Gaussian distribution φ, Eq. (7) becomes an average of normal densities centered on the samples x_i. The probability density function (PDF) is

$$ \hat{f}_k(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right) \tag{8} $$

The scalar function φ(.), also written K(.), is called the Parzen window function or kernel function. The Gaussian kernel is used in the rest of the paper:

$$ \varphi(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) \quad \text{and} \quad h_n = \frac{h_1}{\sqrt{n}}, \quad n = 1, 2, \ldots \tag{9} $$

In practice, the kernel function chosen is not nearly as important as the kernel size h.
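The Gaussian Parzen window estimate of Eqs. (8) and (9) can be sketched in a few lines of Python. This is only a one-dimensional illustration; the names `gaussian_kernel` and `parzen_density`, the sample values and the bandwidth are assumptions for the example rather than the authors' implementation.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian window: exp(-u^2 / 2) / sqrt(2 pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, samples, h):
    """Kernel density estimate (1 / (n h)) * sum_i K((x_i - x) / h)."""
    if h <= 0.0:
        raise ValueError("bandwidth h must be positive")
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    return sum(gaussian_kernel((xi - x) / h) for xi in samples) / (n * h)


# Hypothetical load samples in p.u. and an assumed bandwidth.
samples = [0.30, 0.35, 0.50]
density_at_035 = parzen_density(0.35, samples, h=0.05)
```

A smaller h gives a spikier estimate that follows individual samples, while a larger h smooths them out, which is why the kernel size matters more than the kernel shape.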

3.2. Simple Kernel regression

Regression estimation aims at finding a relationship between a dependent variable and a set of independent variables. Kernel regressions are used when we are unwilling to impose a parametric form on the regression equation and there is a lot of data.

Let the scalars y_i be the outputs and x_i the data inputs. Regression equations are specified as

$$ y_i = m(x_i) + e_i \tag{10} $$

where E[e_i] = Cov[m(x_i), e_i] = 0 and m(.) is a possibly non-linear function. The term e_i is random with mean zero and variance σ². It defines the variation of y_i around its mean, m(x_i). The mean can be expressed as a function of the probability density f:

$$ m(x_i) = E[Y_i \mid x_i = x] = \frac{\int y\, f(x, y)\, dy}{\int f(x, y)\, dy} = \frac{\int y\, f(x, y)\, dy}{f(x)} \tag{11} $$

The kernel smoothed density estimator is assumed to be a combination of Gaussian distributions as a function of n and the kernel bandwidth h. The general form for this type of estimator follows

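$$ m(x) = \frac{\sum_{i=1}^{n} K_h(x_i - x)\, y_i}{\sum_{i=1}^{n} K_h(x_i - x)}, \quad \text{where } K_h(x_t - x) = \frac{1}{h\sqrt{2\pi}} \exp\!\left[-\frac{(x_t - x)^2}{2h^2}\right], \quad K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) \tag{12} $$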

The term $K_h(x_i - x) \big/ \sum_{j=1}^{n} K_h(x_j - x)$ is the weight given to observation i.

The denominator ensures that the weights sum to unity. As an example, the weight function could give equal weight to the k values of x_i that are closest to x and zero weight to all other observations. The N(0, h²) PDF is commonly used. The choice of h allows us to easily vary the relative weights of different observations. This weighting function is positive, so all observations get a positive weight. The weights are largest for observations near x and then taper off in a bell-shaped way. A low value of h means that the weights taper off fast; the weight function is then a normal PDF with a low variance.
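The weighted average of Eq. (12) can be sketched as follows. The code is a minimal one-dimensional illustration with hypothetical function names and data; note that the constant 1/(h sqrt(2 pi)) cancels between numerator and denominator, so it is omitted.

```python
import math


def kernel_weights(x, xs, h):
    """Normalised Gaussian weights K_h(x_i - x) / sum_j K_h(x_j - x)."""
    if h <= 0.0:
        raise ValueError("bandwidth h must be positive")
    raw = [math.exp(-((xi - x) ** 2) / (2.0 * h * h)) for xi in xs]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("query point is too far from all observations for this h")
    return [r / total for r in raw]


def kernel_regression(x, xs, ys, h):
    """Kernel regression estimate m(x) = sum_i w_i(x) * y_i."""
    weights = kernel_weights(x, xs, h)
    return sum(w * y for w, y in zip(weights, ys))


# Hypothetical observations: temperature (deg C) versus load (p.u.).
temps = [-10.0, -5.0, 0.0, 5.0]
loads = [0.90, 0.80, 0.65, 0.55]
estimate = kernel_regression(-4.0, temps, loads, h=2.0)
```

With a small h the estimate is dominated by the nearest observation; with a large h it approaches the plain average of all loads.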

3.3. Practical regression

In practice we have to estimate m(x) at a finite number of points x. The load forecast y_t depends on many variables (x_t, z_t, l_t, ..., h_t) and the general formulation of a local averaging estimator uses the multivariate kernel regression:

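$$ m(x) = \frac{\sum_{i=1}^{n} K_{h_x}(x_i - x)\, K_{h_z}(z_i - z)\, K_{h_l}(l_i - l)\, y_i}{\sum_{i=1}^{n} K_{h_x}(x_i - x)\, K_{h_z}(z_i - z)\, K_{h_l}(l_i - l)} \tag{13} $$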

where

$$ K_h(x_i - x) = \frac{1}{h\sqrt{2\pi}} \exp\!\left[-\frac{(x_i - x)^2}{2h^2}\right] \tag{14} $$

Optimization estimators, on the other hand, are more amenable to incorporating additional structure.
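The product-kernel estimator of Eqs. (13) and (14) can be sketched for an arbitrary number of explanatory variables, each with its own bandwidth. The function name `product_kernel_regression`, the choice of variables (temperature and hour of day) and the numbers are assumptions made for this example only.

```python
import math


def product_kernel_regression(query, rows, ys, bandwidths):
    """Multivariate kernel regression with a product of Gaussian kernels.

    query      -- tuple of explanatory values at the forecast point
    rows       -- list of tuples of the same variables for past observations
    ys         -- observed loads matching rows
    bandwidths -- one positive bandwidth per variable
    """
    if any(h <= 0.0 for h in bandwidths):
        raise ValueError("all bandwidths must be positive")
    numerator = 0.0
    denominator = 0.0
    for row, y in zip(rows, ys):
        weight = 1.0
        for q, v, h in zip(query, row, bandwidths):
            # Normalising constants cancel in the ratio, so they are omitted.
            weight *= math.exp(-((v - q) ** 2) / (2.0 * h * h))
        numerator += weight * y
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no observation is close enough to the query point")
    return numerator / denominator


# Hypothetical history: (temperature in deg C, hour of day) -> load in p.u.
history = [(-5.0, 8.0), (-5.0, 20.0), (10.0, 8.0)]
history_loads = [0.90, 0.60, 0.50]
forecast = product_kernel_regression((-5.0, 8.0), history, history_loads, (1.0, 1.0))
```

Because the weights multiply across variables, an observation contributes noticeably only when it is close to the query point in every variable at once.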

As a prelude to our later discussion, consider the following estimator. Given

$$ \hat{y}_i = m(x_i,\, y_i(h),\, \ldots,\, l_i) + e \tag{15} $$

Then we solve:

$$ \min_{h}\; \frac{1}{T} \sum_{i} \left( y_i(h) - \hat{y}_i \right)^2 \tag{16} $$
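One common way to carry out a minimisation of the kind shown in Eq. (16) is a grid search over candidate bandwidths scored by leave-one-out cross validation. The sketch below assumes that procedure; the grid search, the leave-one-out scoring and the names `loo_mse` and `select_bandwidth` are illustrative choices, not details given in the paper.

```python
import math


def loo_mse(xs, ys, h):
    """Leave-one-out mean squared error of a Gaussian kernel regression."""
    total = 0.0
    for i in range(len(xs)):
        numerator = 0.0
        denominator = 0.0
        for j in range(len(xs)):
            if j == i:
                continue  # leave observation i out of its own estimate
            w = math.exp(-((xs[j] - xs[i]) ** 2) / (2.0 * h * h))
            numerator += w * ys[j]
            denominator += w
        if denominator == 0.0:
            return math.inf  # h too small: no neighbour carries any weight
        total += (ys[i] - numerator / denominator) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the lowest leave-one-out error."""
    return min(candidates, key=lambda h: loo_mse(xs, ys, h))


# Synthetic data standing in for a smooth load-versus-input relationship.
xs = [0.5 * k for k in range(20)]
ys = [math.sin(x) for x in xs]
candidates = [0.1, 0.5, 5.0]
best_h = select_bandwidth(xs, ys, candidates)
```

A finer grid or a scalar optimiser could replace the three candidates used here; the scoring function stays the same.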

3.4. Confidence intervals and bands

Define the confidence interval D(k) of the estimated load P_t as follows:

D = [L, U], where L is the lower band and U the upper band of the estimator,


$$ D(k) = \{\, p_{avr} - C\,\mathrm{std}(p_e),\;\; p_{avr} + C\,\mathrm{std}(p_e) \,\} $$

where p_avr is the mean value of the estimated load and $C = x_{1-\alpha/2}$, with $x_{1-\alpha/2}$ the (1 - α/2) quantile of the standard Gaussian distribution, for a probability p = 1 - α.
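The band above can be computed directly with the Python standard library. In this sketch the use of the sample standard deviation for std(p_e), the function name `confidence_band` and the three load estimates are assumptions for illustration.

```python
from statistics import NormalDist, mean, stdev


def confidence_band(estimates, alpha=0.05):
    """Return (lower, upper) = p_avr -/+ C * std(p_e).

    C is the (1 - alpha / 2) quantile of the standard Gaussian distribution.
    std is taken as the sample standard deviation (an assumption).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    c = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    p_avr = mean(estimates)
    spread = stdev(estimates)
    return p_avr - c * spread, p_avr + c * spread


# Hypothetical load estimates in p.u. for one forecast hour.
lower, upper = confidence_band([0.40, 0.42, 0.44], alpha=0.05)
```

For alpha = 0.05 the quantile C is about 1.96, which gives the familiar 95% band.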

4. Smoothing parameters for feeder load profile forecasting

Electric load aggregation is the process by which large commercial, residential and industrial loads are combined to form an aggregated load with homogeneous and non-homogeneous loads.

The main objective of this section is to obtain the probability forecast functions of the aggregated load. The method consists of two steps. First, smoothing techniques based on kernel estimates are applied to derive non-parametric estimators. The aggregated load is the sum of the individual category forecasts. Then, parameter smoothing is applied to form the aggregated load. Parameter smoothing is modelled as being unknown but bounded in amplitude by a closed convex set.

At any given instant, the aggregated smoothing parameter h_agr is equal to the weighted sum of the category parameters:

$$ h^{u}_{agr}(t) = \sum_{i=1}^{C} k_i\, h^{u}_{i}(t) \tag{17} $$

subject to:

$$ \sum_{i=1}^{C} k_i = 1 \tag{18} $$

where u = 1, ..., m, with m the total number of parameters included in the load model. Here the parameters are weather and time for a weekday load. This added constraint requires that the aggregated parameters lie within the smallest convex set containing the points formed by the category load parameters. The problem is then cast in the form of a constrained optimization problem:

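$$ \min_{h}\; \frac{1}{T} \sum_{t} \left( y_t(h) - \hat{y}_t \right)^2 \quad \text{subject to} \quad \sum_{i=1}^{C} k_i = 1, \quad k_i \ge 0 \tag{19} $$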
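The convex combination of Eqs. (17) and (18), together with the non-negativity condition of Eq. (19), can be illustrated with a short Python sketch. The function name `aggregate_bandwidth`, the tolerance and the numerical values are hypothetical.

```python
def aggregate_bandwidth(category_bandwidths, weights, tol=1e-9):
    """Aggregated smoothing parameter h_agr = sum_i k_i * h_i.

    The weights k_i must be non-negative and sum to one, so h_agr always
    lies between the smallest and largest category parameter.
    """
    if len(category_bandwidths) != len(weights):
        raise ValueError("one weight is required per category")
    if any(k < 0.0 for k in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to one")
    return sum(k * h for k, h in zip(weights, category_bandwidths))


# Hypothetical smoothing parameters for three load categories.
h_categories = [0.2, 0.4, 0.6]
k = [0.5, 0.3, 0.2]
h_agr = aggregate_bandwidth(h_categories, k)
```

In the constrained problem of Eq. (19) the weights themselves would be chosen by the optimiser; here they are fixed by hand only to show the feasibility conditions.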

5. Illustration and comparison of methods

This section is a scoping study for the methodology, based upon a data set from customers in the Montreal region of the Hydro-Quebec network. In this study, the electricity consumption of five load categories sampled at 60 min intervals was recorded over a one-month period. The

[Plot: "Forecasting for the residential load". Load in p.u. versus time in hours; curves: non-parametric method, measured load, thermostat method.]

Fig. 1. Load forecast over a 24 h period for a residential load.

[Plot: "Relative error over a 24 hours period for a residential load". Relative error versus time in hours; curves: thermostat method, non-parametric method.]

Fig. 2. Forecasting error for the residential load.

[Plot: "Load forecast over a 24 hours period for a commercial load". Load in p.u. versus time in hours; curves: time series method, measured load, non-parametric method.]

Fig. 3. Forecast over a 24 h period for a commercial load.



recorded historical data are used to compute the parameters of each forecasting technique in the paper. The historical load, the hour and the corresponding temperature are used for the modeling procedure.

Fig. 1 shows the forecasts for a residential load and the recorded values. The non-parametric approach yields forecasts very close to the measured load over a 24 h period. The thermostat-based method does not yield the same accuracy; the forecasting error with these two methods is illustrated in Fig. 2. The non-parametric technique has an average error of 0.01 p.u., while the forecasting error using the thermostat approach varies between 0.04 and 0.05 p.u. From Fig. 2, we compute MAPE = 1.5% for the non-parametric model and MAPE = 2.2% for the thermostat model.

The second data set represents a commercial load (office load). Fig. 3 illustrates the forecasted load using long-proven time series [14] and non-parametric techniques. Both techniques give satisfactory results for very short-term forecasts, up to about 7 h ahead. However, for longer periods, the non-parametric approach has a better tracking performance than the time series approach. The average error is illustrated in Fig. 4. From this figure we compute MAPE = 0.87% for the non-parametric model and MAPE = 3.04% for time series. Time series models may be made more accurate, but this requires large amounts of good-quality historical data.

Finally, Fig. 5 shows the normalized error between the aggregated commercial load and the sum of the individual category forecasts. In this case, MAPE = 2.69%, higher than without decomposition-aggregation.
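For reference, the mean absolute percentage error (MAPE) used in the comparisons above can be computed as in the following sketch. The function name `mape` and the sample values are hypothetical and are not the paper's data.

```python
def mape(measured, forecast):
    """Mean absolute percentage error, in percent."""
    if len(measured) != len(forecast) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(m == 0.0 for m in measured):
        raise ValueError("measured values must be non-zero")
    ratios = [abs((m - f) / m) for m, f in zip(measured, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical measured and forecast loads in p.u.
error_percent = mape([0.40, 0.50], [0.41, 0.49])
```

The two relative errors here are 2.5% and 2.0%, so the MAPE is 2.25%.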

6. Conclusion

The proposed non-parametric forecasting methods perform very well compared with the well-known thermostat-based techniques. These non-parametric algorithms give results that are more consistent. Their performance is also very competitive even with linear time series. They are automatic procedures that do not need any prior information. In short, the method searches a collection of historical observations for records similar to the current conditions and uses these to estimate the future state of the system. The method is simple, has a sound theoretical basis and provides the best forecast in the sense of a minimum expected squared error. It requires few parameters, which can be easily calculated from historical data by applying the cross validation technique.

There are drawbacks to non-parametric estimation. Whereas parametric models compress all training data into a set of equations through the process of parameter fitting, non-parametric regression retains the data and searches through them for similar past cases each time a forecast is made. If the number of variables is very large, the non-parametric methods become inefficient. This was not a problem for the load forecasting problem, but may become one in very short-term estimation.

References

[1] Haida T, Muto S. Regression based peak load forecasting using transformation technique. IEEE Trans Power Syst 1994;9(4):1788-94.

[2] Ramanathan R, Engle R, Granger CW, Vahid-Araghi F, Brace C. Short-term forecasts of electricity loads and peaks. Int J Forecasting 1997;13:161-74.

[3] Dillon TS, Sestito S, Leung S. An adaptive neural network approach in load forecasting in a power system. In: Proceedings of the first international forum on applications of neural networks to power systems; 1991. p. 17-21.

[4] Peng TM, Hubele NF, Karady G. An adaptive neural network approach to 1 week ahead load forecasting. IEEE Trans Power Syst 1993;8(3):1195-203.

[5] Kodogiannis VS, Anagnostakis EM. A study of advanced learning algorithms for short-term load forecasting. Eng Artif Intell 1999;12:159-73.

[Plot: "Error for the commercial loads". Relative error versus time in hours; curves: non-parametric method, time series method.]

Fig. 4. Forecasting error for the commercial load.

[Plot: "Difference between aggregate commercial load and the sum of category loads". Error in p.u. versus time in hours; curves: measured aggregated load, estimated aggregated load.]

Fig. 5. Error between the aggregated commercial load and the sum of category loads.


[6] Villalba SA, Bel CA. Hybrid demand model for load estimation and short-term load forecasting in distribution electric systems. IEEE Trans Power Deliver 2000;15(2):764-9.

[7] Liang RH, Cheng CC. Short-term load forecasting by a neuro-fuzzy based approach. Electr Power Energ Syst 2002;24:103-11.

[8] Hippert HS, Pereira CE, Souza RC. Neural networks for short-term load forecasting: a review and evaluation. IEEE Trans Power Syst 2001;16(1):44-55.

[9] Dillon TS, Morsztyn K, Phula K. Short-term load forecasting using adaptive pattern recognition and self organizing techniques. In: Proceedings of the 5th power system computational conference, Cambridge, September; 1975.

[10] Abdel-Aal RE. Short-term hourly load forecasting using abductive networks. IEEE Trans Power Syst 2004;19(1):164-73;
Charytoniuk W, Chen SM, Olinda V. Non-parametric regression based short-term load forecasting. IEEE Trans Power Syst 1998;13(3):725-30.

[11] Parzen E. On estimation of a probability density function and mode. Ann Math Stat 1962;33:1065-76.

[12] Izenman AJ. Recent developments in non-parametric density estimation. J Am Stat Assoc 1991;86:205-24.

[13] Duda RO, Hart PE, Stork DG. Pattern classification; 2001.

[14] Mu G, Chen YH, Liu ZF, Fan WD. Studies on the forecasting errors of the short-term load forecast power system technology. In: Proceedings of the power con., vol. 1; 2001. p. 636-640.

