Assessment of Forecasting Strategies on Univariate Time Series Data

Anmol Singh Sethi
Mar 11, 2020

Report submitted for the semester project in my college.

Introduction

The expansion planning of power systems begins with a forecast of the anticipated future load requirement. Estimates of both demand and energy requirements are integral to an effectively planned system[1]. Demand forecasts determine the capacity additions to the generation, transmission and distribution systems, while energy forecasts determine the kind of facilities required[10]. Energy forecasts are further used to monitor and manage fuel consumption and procurement according to current market prices, so that an adequate rate of return is maintained.

Load forecasts are usually made by building models on relevant information such as weather and previous load demand data. This approach is mostly relied on for short term load forecasting, since forecasting over a long horizon, such as a month or a year ahead, leads to error propagation[2]. In the preceding decades, a number of methodologies for power system load forecasting have been proposed, such as AR, MA, ARMA and ARIMA. The time series approach rests on the assumption that a load pattern is a signal with known periodic variations, such as daily, weekly and seasonal cycles[5]. These variations help give a rough estimate of the load requirement for a given time frame.

There are broadly four categories of load forecasting, namely:-

  1. Very Short term load forecasting
  2. Short term load forecasting
  3. Medium term load forecasting
  4. Long term load forecasting

Very short term load forecasting covers a few minutes up to an hour and is mainly used for real time evaluation and security[1]. Short term load forecasting covers a few hours to a few weeks and is aimed at regulating fuel procurement, short term maintenance scheduling and economic scheduling of the required generating capacity. Medium term forecasts of up to 5 years ahead are required for transmission system planning, maintenance scheduling and price setting, so that demand can be met appropriately. Long term forecasts of up to 20 years are required for regulation policies.

Problem Statement

We are given the problem of forecasting the future demand for electrical load on the basis of an analysis of given past data. The problem is to be approached through a non-conventional algorithm, namely Long Short Term Memory (LSTM). Results obtained from established methods such as AR and ARIMA are to be compared with the results obtained from LSTM to give a better view of our designated approach.

Literature Survey

Nataraja C. et al. carried out a comparative study of short term load forecasting models using load data for the years 2011 and 2012 for the state of Karnataka[1]. Auto-regressive (AR), Auto-regressive Moving Average (ARMA) and Auto-regressive Integrated Moving Average (ARIMA) models were compared against each other. The pipeline was a multi-phase process covering initial development, tuning and modification, followed by prediction and result gathering. The tested models had errors ranging from 6.15 percent to 13.03 percent.

Gao Gao et al. at the Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow, UK compared two models, ARIMA and an Artificial Neural Network (ANN), on UK electricity market data[2]. The models were trained on data spanning eight weeks, and the ANN consisted of twenty neurons. The comparison was made on the basis of root mean squared error, and the results show that ARIMA gives better results than the ANN.

Chikobvu et al. developed a seasonal auto-regressive integrated moving average (SARIMA) model and predicted the daily peak demand of electricity in South Africa for the period 1996 to 2009[3]. They concluded that the SARIMA model produced better results. Nor Hamizah Miswan et al. compared a combined ARIMA with regression model against the individual benchmark ARIMA and regression models to forecast electricity load demand in a Malaysian city[4]. The models were compared using root mean square error along with mean squared error. The results favoured the combined method, making it the better model.

I. A. Iwok et al. compared univariate and multivariate time series models[5], using a data set of Nigeria’s Gross Domestic Product. Performance was evaluated using mean squared error. The study concluded that the stationary univariate time series model outperformed the other models.

N. A. Abd Jalil et al. focused on electrical load demand forecasting using exponential smoothing methods[6]. The data set was the half-hourly demand of Malaysia for one complete year. Forecasting accuracy was compared using Mean Absolute Percentage Error. Several smoothing methods were applied, including Holt-Winters Taylor and traditional Holt-Winters.

V. Venkatesh et al. studied and developed short term load forecasting models using stochastic time series analysis[7]. Their one-year data set was split into two parts: the first six months were used to train the models and the latter six months to test them. AR, ARMA and ARIMA models were developed. Nima Amjady developed a model for short-term hourly load forecasting using time-series modeling with peak load estimation capability[8]. The main conclusion of that paper was that this technique attained better results than traditional Artificial Neural Network models or the Box-Jenkins model. Hippert et al. reviewed and evaluated the use of neural networks for short-term load forecasting[9].

Various other approaches have been developed for the same analysis, such as support vector machines for load forecasting in a EUNITE competition[10], fuzzy neural networks[11] and knowledge based expert systems[12]. Another approach develops an iteratively reweighted least squares algorithm for short term power system load forecasting[13]. Mohamed A. Abu-El-Magd et al. compared online and offline methods for short-term electric load forecasting; the load demand was also modelled using multivariate time series analysis[14]. G. T. Heinemann and team studied temperature sensitive and non-temperature sensitive load and carried out a regression analysis for the same[15].

Data Set Description

1. FRED Economic Data:- This data-set covers Industrial Production: Electric and Gas Utilities and is provided by the Board of Governors of the Federal Reserve System (US). It has a monthly frequency. The data-set contains the date and the demand column.[16]

Fred Data set

2. Open Power System Data:- This is a time series data-set giving hourly data on power load (consumption), wind and solar power resources and production, prices, etc. The data is available for 37 European countries. This data set is provided by the Open Knowledge Foundation. The time series include electricity consumption (load) and other data for power system modelling. Among the many columns, the one used is GB EAW load actual, which gives the total load for each day in England and Wales. It is published by National Grid. [17]

Open Power System Data Data-set

3. SMARD (Strommarktdaten):- This data set, provided by the Federal Network Agency, Germany, contains electricity market data. It was created with the aim of improving transparency. The frequency of the data is daily, and the data set contains the load value for every 15 minutes of each day. [18]

SMARD

4. Consumption of Household Electric Power Dataset:- This data set is provided by the UC Irvine Machine Learning Repository. It contains 2,075,259 instances for a single household located in France, sampled once per minute over a total period of 47 months. Nearly 1.25 percent of the values are missing. The data-set contains 9 columns, of which two, namely date and global reactive power, have been taken into consideration. [19]

Individual household electric power consumption Data Set

5. Uttar Pradesh State Load Dispatch Data:- This is real time data provided by the Uttar Pradesh State Load Dispatch Centre. The website provides real time data for schedule, draw, demand, total SSGS, UP Thermal Generation, deviation rate, etc. The data has been web-scraped over a period of 1 day. [20]


Methodology

In this section we explain the complete set of steps we followed. The methodology can be represented as the following pipeline:-

In the data pre-processing step we first removed the unwanted columns from the data sets. We then removed or edited the unwanted data. The steps differed for each data set, since each came in a different format. We then applied the following methods. Each of the following sections explains what a method is and how it works, and then how we used it in our project.
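As an illustration of this kind of pre-processing, the sketch below cleans a tiny made-up table with pandas. The column names (timestamp, load, note), the "?" marker for missing readings, the forward filling of gaps and the resampling to daily means are all assumptions made for the example, not the exact steps used in the project.

```python
import pandas as pd

# Hypothetical raw extract: column names and values are invented for illustration.
raw = pd.DataFrame({
    "timestamp": ["2019-11-18 00:00", "2019-11-18 12:00",
                  "2019-11-19 00:00", "2019-11-19 12:00"],
    "load": ["100.0", "?", "120.0", "140.0"],   # "?" marks a missing reading
    "note": ["a", "b", "c", "d"],               # a column we do not need
})

def preprocess(frame: pd.DataFrame) -> pd.Series:
    """Return a clean univariate series of daily mean load."""
    df = frame.drop(columns=["note"])                        # drop unwanted columns
    df["timestamp"] = pd.to_datetime(df["timestamp"])        # parse dates
    df["load"] = pd.to_numeric(df["load"], errors="coerce")  # "?" becomes NaN
    df = df.set_index("timestamp").sort_index()
    df["load"] = df["load"].ffill()                          # assumed gap handling: carry last value forward
    return df["load"].resample("D").mean()                   # one value per day

daily_load = preprocess(raw)
```

The result is a single date-indexed column, which is the univariate form that the methods in the following sections expect.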

Auto Regression

Regression is a process in which we try to predict one variable with the help of one or more other variables. Auto regression is a special type of regression in which a variable is predicted from the past values of the same variable. For example, we can predict the annual price of gold, or the electricity load over a period of time, based on the 10 previous values. Since other important factors are not taken into account, for example weather in the case of electricity load prediction, the predicted values might not be exactly correct. Still, it makes sense to find a pattern over a period of time, which gives a much stronger prediction.
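The idea can be sketched in a few lines of NumPy: an AR(p) model is a linear regression of each value on its p previous values. The function names fit_ar and forecast_ar, the least squares fitting and the toy series are assumptions made for illustration, not the project's actual implementation.

```python
import numpy as np

def fit_ar(series, p):
    """Fit an AR(p) model by ordinary least squares; returns (intercept, coefficients)."""
    y = np.asarray(series, dtype=float)
    # Column k holds the value k steps before each target: y[t-1], ..., y[t-p].
    X = np.column_stack([y[p - k : len(y) - k] for k in range(1, p + 1)])
    X = np.column_stack([np.ones(len(X)), X])      # intercept column
    beta, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return beta[0], beta[1:]

def forecast_ar(series, intercept, coefs, steps):
    """Iterated forecasts: each prediction is fed back in as an input."""
    history = list(np.asarray(series, dtype=float))
    p = len(coefs)
    out = []
    for _ in range(steps):
        lags = history[-1 : -p - 1 : -1]           # most recent value first
        nxt = intercept + float(np.dot(coefs, lags))
        out.append(nxt)
        history.append(nxt)
    return out

# Toy series that follows y[t] = 2 + 0.5 * y[t-1] exactly (real load data is noisy).
load = [10.0]
for _ in range(19):
    load.append(2.0 + 0.5 * load[-1])

intercept, coefs = fit_ar(load, p=1)
next_values = forecast_ar(load, intercept, coefs, steps=3)
```

Because the toy series is noise free, the fit recovers an intercept of about 2 and a coefficient of about 0.5. Passing p=10 corresponds to the "10 previous values" case mentioned above.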

Auto Regressive Integrated Moving Average - ARIMA

ARIMA stands for Auto Regressive Integrated Moving Average. It combines two models, the AR model and the MA model, and the I term is the number of times we difference the data to make it stationary. The AR model is explained above; the MA model takes previous error residuals into account while predicting. ARMA is the combination of the AR and MA models. Normally we apply ARMA to stationary data, i.e. data whose mean and variance remain constant over time. There are two kinds of ARIMA model: seasonal and non-seasonal. If the data exhibits seasonality we should use seasonal ARIMA, otherwise non-seasonal ARIMA. Seasonal ARIMA is written as SARIMA. The main difference between the two is that SARIMA has more parameters. Non-seasonal ARIMA is represented as ARIMA(p, d, q), whereas SARIMA is represented as ARIMA(p, d, q)(P, D, Q)m. Here p, d and q are the AR order, the degree of differencing and the MA order, (P, D, Q) are their counterparts for the seasonal part, and m is the number of periods in each season. Non-seasonal ARIMA does not contain the P, D, Q values.
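The differencing step, i.e. the "I" part with parameters d, D and m, can be sketched on its own. The helper name difference and the toy series below are assumptions made for illustration; the sketch covers only the differencing and does not fit the AR or MA terms.

```python
import numpy as np

def difference(series, d=0, D=0, m=1):
    """Apply D seasonal differences at lag m, then d ordinary differences."""
    y = np.asarray(series, dtype=float)
    for _ in range(D):
        y = y[m:] - y[:-m]      # seasonal difference: y[t] - y[t-m]
    for _ in range(d):
        y = np.diff(y)          # ordinary difference: y[t] - y[t-1]
    return y

# Toy series: a linear trend plus a pattern that repeats every 4 periods.
t = np.arange(16)
load = 3.0 * t + np.tile([0.0, 5.0, 0.0, -5.0], 4)

seasonal = difference(load, D=1, m=4)        # removes the repeating pattern
both = difference(load, d=1, D=1, m=4)       # also removes what is left of the trend
```

In this toy case one seasonal difference leaves a constant series, since the trend rises by 12 over each block of 4 periods, and one further ordinary difference leaves zeros. Each difference shortens the series: by m values for a seasonal difference and by one for an ordinary one.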

Long Short Term Memory — LSTM

Long Short Term Memory (LSTM) networks are a special type of recurrent neural network (RNN), which have been used for tasks such as music completion and handwriting generation. For those problem areas they are much more effective than the standard RNN. RNNs appeal to computer scientists because they can use recent past data to predict the near-future characteristics of a system. The problem arises when data older than the immediately preceding values plays a part in determining the future output. Here, the gap between the relevant information and the point where it is needed becomes large enough to be out of the scope of standard RNNs. Theoretically, RNNs can handle such dependencies, but in practice they fail to do so.

LSTMs, on the other hand, are capable of learning long term dependencies. They were introduced by Hochreiter and Schmidhuber (1997)[21] and were designed to solve the long term dependency problem. Remembering information over long periods is their default behaviour. LSTMs have a chain-like structure similar to RNNs, but they differ in the repeating module: instead of a single neural network layer, they have four.
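The four layers of the repeating module can be made concrete with a single time step written in NumPy. This is a sketch of the commonly used LSTM cell with a forget gate, not the network trained in this project; the weight layout, the sizes and the random initialisation are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4 * hidden, hidden + inputs) and b has shape (4 * hidden,);
    the four row blocks are the four layers of the repeating module.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0 * hidden : 1 * hidden])   # forget gate layer
    i = sigmoid(z[1 * hidden : 2 * hidden])   # input gate layer
    g = np.tanh(z[2 * hidden : 3 * hidden])   # candidate values layer
    o = sigmoid(z[3 * hidden : 4 * hidden])   # output gate layer
    c = f * c_prev + i * g                    # cell state: carries long term information
    h = o * np.tanh(c)                        # hidden state: output of this step
    return h, c

# Run the cell over a short univariate sequence with untrained random weights.
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 3
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)
h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for value in [0.2, 0.4, 0.6]:
    h, c = lstm_step(np.array([value]), h, c, W, b)
```

The cell state c is changed only by elementwise scaling and addition, which is what lets information persist across many time steps. In a forecasting model the final hidden state h would typically be passed through one more linear layer to produce the predicted load.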

Results

References

[1] Nataraja, C.; Gorwar, Mahesh; Shilpa, G.N.; Harsha J, Shri. (2012). Short term load forecasting using time series analysis: A case study for Karnataka, India. International Journal of Engineering Science and Innovative Technology. 1. 45–53.

[2] Gao, Gao; Lo, Kwoklun; Fan, Fulin. (2017). Comparison of ARIMA and ANN Models Used in Electricity Price Forecasting for Power Market. Energy and Power Engineering. 09. 120–126. 10.4236/epe.2017.94B015.

[3] Chikobvu, Delson; Sigauke, Caston. (2012). Regression-SARIMA modelling of daily peak electricity demand in South Africa. Journal of Energy in Southern Africa. 23. 23–30. 10.17159/2413-3051/2012/v23i3a3169.

[4] Miswan, Nor; Mohd Said, Rahaini; Anuar, S.H.H. (2016). ARIMA with regression model in modelling electricity load demand. 8. 113–116.

[5] Iwok, Iberedem; Okpe, A. (2016). A Comparative Study between Univariate and Multivariate Linear Stationary Time Series Models. American Journal of Mathematics and Statistics. 2016. 203–212. 10.5923/j.ajms.20160605.02.

[6] Jalil, N.A.; Ahmad, Maizah; Mohamed, N. (2013). Electricity load demand forecasting using exponential smoothing methods. World Applied Sciences Journal. 22. 1540–1543. 10.5829/idosi.wasj.2013.22.11.2891.

[7] “Load Forecasting Bibliography”, Phase I, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-99, No. 1, January/February 1980.

[8] “Short term load forecasting using time series modeling with peak load estimation capability”, IEEE Transactions on Power Systems, Vol. 16, No. 3, August 2001.

[9] Hippert, H.S.; Pedreira, Carlos; Souza, Reinaldo. (2001). Neural Networks for Short-Term Load Forecasting: A Review and Evaluation. Power Systems, IEEE Transactions on. 16. 44–55. 10.1109/59.910780.

[10] “Load forecasting using support vector machines: A study on EUNITE competition 2001”, IEEE Transactions on Power Systems, Vol. 19, No. 4, November 2004.

[11] “Short term load forecasting using fuzzy neural networks”, IEEE Transactions on Power Systems, Vol. 10, No. 3, August 1995.

[12] “Short term load forecasting for fast developing utility using knowledge-based expert systems”, IEEE Transactions on Power Systems, Vol. 17, No. 4, May 2002.

[13] “Short term power system load forecasting using the iteratively reweighted least squares algorithm”, Electric Power Systems Research, 19 (1990), pp. 11–12.

[14] M. A. Abu-El-Magd and N. K. Sinha, “Short-Term Load Demand Modeling and Forecasting: A Review,” in IEEE Transactions on Systems, Man, and Cybernetics, vol. 12, no. 3, pp. 370–382, May 1982. doi: 10.1109/TSMC.1982.4308827.

[15] G. T. Heinemann, D. A. Nordmian and E. C. Plant, “The Relationship Between Summer Weather and Summer Loads - A Regression Analysis,” in IEEE Transactions on Power Apparatus and Systems, vol. PAS-85, no. 11, pp. 1144–1154, Nov. 1966.

[16] Board of Governors of the Federal Reserve System (US), Industrial Production: Electric and gas utilities [IPG2211A2N], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/IPG2211A2N, November 18, 2019.

[17] “Open Power System Data. 2019. Data Package Time series. Version 2019-06-05. https://doi.org/10.25832/time_series/2019-06-05. (Primary data from various sources, for a complete list see URL).”

[18] “SMARD Version 2019-19-11. https://smard.de, November 19, 2019”

[19] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

[20] Uttar Pradesh State Load Dispatch Centre. 2019. UP Generation Summary. 2019 Version 11–19–2019. http://www.upsldc.org/real-timedata/2019-11-19 27.

[21] Hochreiter, Sepp; Schmidhuber, Jürgen. (1997). Long Short-term Memory. Neural Computation. 9. 1735–80. 10.1162/neco.1997.9.8.1735.

Co-Authors

  1. Sai Charan Teja Tanguturu
  2. Manavdeep Singh

Supervisor

Prof. OP Vyas
