Study of Various Forecasting Models for Time Series Data Using Stochastic Processes
Sheetal S. Patil*, S.H. Patil
Department of Computer Engineering, Bharati Vidyapeeth Deemed to be University College of Engineering Pune 411043, India.
*Corresponding Author Email: sspatil@bvucoep.edu.in
ABSTRACT:
Data recorded in time-stamped format is called time series data. Time series data is everywhere: weather data, stock market data, health care data, sensor data, network data, sales data, and many more. Time series have several components that make the data complex: trend, seasonality, cyclical variation, and irregularities. Because everyone is interested in knowing the future, forecasting using time series data is an important point of consideration. This research paper focuses on the components of time series data and, at the same time, studies different time series modelling and forecasting techniques based on stochastic processes. All the models discussed here use past time series data to forecast future values. The paper covers AR, MA, Random Walk, ARMA, ARIMA, SARIMA, and Exponential Smoothing processes (single, double, and triple), which are used for forecasting time series data.
KEYWORDS: Time Series data, ARMA, ARIMA, SARIMA, Exponential Smoothing.
INTRODUCTION:
Everyone is interested in knowing the future. If we had the power to know what is going to happen, we could make alterations to obtain more suitable results, for example in financial gains and budgets. That power is only imagination, but we can build something, namely a forecasting model, that works in much the same way.
Time series data is data associated with time, also called time-stamped data. The observations form a sequence collected through repeated measurement over time. This kind of time-stamped data is everywhere: financial market data, weather data, health-related data, sensor data, click data, and network traffic data are just a few examples^{1}.
It has been said, correctly, that the present moment is an accumulation of the past. Time series forecasting can thus be defined as the act of predicting the future by understanding the past. But because the nature of time series data is complex, forecasting models using time series data inherit that complexity. The complex components of time series data are as follows: trend, seasonality, cyclic variation, and irregularities^{2-4}.
This research paper presents the complex properties of time series data and, taking those properties into account, shows how different stochastic models are applied to time series data to forecast the future.
Stochastic Processes for Forecasting:
While processing time series data one must consider the following properties: autocorrelation, seasonality, and stationarity. Autocorrelation is the similarity between observations as a function of the time lag between them. Seasonality refers to periodic fluctuations; for example, online sales increase during Diwali. Seasonality can be detected from the autocorrelation plot: if the plot has a sinusoidal shape, seasonality is present. A time series is said to be stationary if its statistical properties do not change over time; in other words, it has constant mean and variance, and its covariance is independent of time. A simple example of a generated series is a collection of uncorrelated random variables with mean zero and finite variance σ². A time series generated from uncorrelated variables is used as a model for noise in engineering applications, where it is called white noise (Kumar, 2016)^{5}.
We need a stationary time series for forecasting modelling. Not all time series data are stationary, and we need to transform such series using different techniques. There are two types of stationarity: strict stationarity and weak stationarity. After transformation we must test whether the series has in fact become stationary; the Dickey-Fuller test serves this purpose. If the test's p-value is greater than the chosen significance level (commonly 0.05), we fail to reject the unit-root null hypothesis and treat the process as non-stationary; if the p-value falls below it, the process is considered stationary.
Weak stationarity holds when there is no systematic change in the mean, no systematic change in the variance, and no strictly periodic variations.
Autocovariance coefficient:
Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = Cov(Y, X)    (Equation 1)^{6}
Autocovariance function: It is simply the covariance of different elements in our sequence of data.
γ(s, t) = Cov(X_s, X_t) = E[(X_s − μ_s)(X_t − μ_t)]    (Equation 2)^{6}
γ(t, t) = E[(X_t − μ_t)²] = Var(X_t) = σ_t²    (Equation 3)^{6}
γ_k = γ(t, t + k) ≈ c_k    (Equation 4)^{6}
Autocorrelation coefficient: Autocorrelation is the similarity between observations as a function of the time lag between them. The autocorrelation coefficient between x_t and x_{t+k} is defined as
−1 ≤ ρ_k = γ_k / γ_0 ≤ 1    (Equation 5)^{6}
The autocorrelation coefficient at lag k is estimated as
r_k = c_k / c_0    (Equation 6)
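Equation 6 can be computed directly from the data. A minimal NumPy sketch (the helper name `sample_acf` is mine, not from the paper):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation r_k = c_k / c_0 for k = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n, mean = len(x), x.mean()
    c0 = np.sum((x - mean) ** 2) / n  # sample autocovariance at lag 0
    r = []
    for k in range(max_lag + 1):
        # Sample autocovariance c_k at lag k
        ck = np.sum((x[:n - k] - mean) * (x[k:] - mean)) / n
        r.append(ck / c0)
    return np.array(r)

rng = np.random.default_rng(0)
noise = rng.normal(size=1000)
r = sample_acf(noise, 5)
# r[0] is always 1; for white noise the remaining lags are near zero
```

Plotting r against k gives the correlogram used throughout the paper.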
Random Walk Model: A random walk and a random series are different things. A process that generates a series by forcing dependence from one time step to the next is called a random walk. A random walk can be represented pictorially using a correlogram.
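A random walk is easily simulated by accumulating white-noise steps; here is a NumPy sketch (illustrative, not the paper's R code). Differencing the walk recovers the independent steps, which is why differencing is the standard way to make a random walk stationary:

```python
import numpy as np

rng = np.random.default_rng(1)
steps = rng.normal(size=1000)   # independent white-noise increments Z_t
walk = np.cumsum(steps)         # X_t = X_{t-1} + Z_t

# Differencing the random walk recovers the white-noise steps
recovered = np.diff(walk)
```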
AR(p): In an autoregression model, we predict the data at a future time stamp by observing data at previous time stamps; the term autoregression indicates that previous time-stamped data is the input to the model. An autoregressive model of order p is represented as
AR(p) process:
X_t = Z_t + φ_1 X_{t−1} + ⋯ + φ_p X_{t−p}    (Equation 7)^{6}
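Equation 7 can be illustrated by simulating an AR(2) series and recovering its coefficients by least squares on the lagged values (a NumPy sketch; the φ values are chosen arbitrarily for illustration):

```python
import numpy as np

# Simulate an AR(2) process: X_t = Z_t + 0.6 X_{t-1} - 0.3 X_{t-2}
rng = np.random.default_rng(7)
n, phi1, phi2 = 2000, 0.6, -0.3
z = rng.normal(size=n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + z[t]

# Recover the coefficients by regressing X_t on its own lags
X = np.column_stack([x[1:-1], x[:-2]])   # columns: X_{t-1}, X_{t-2}
y = x[2:]
phi_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
# phi_hat should be close to (0.6, -0.3)
```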
Moving average processes: This is a process that considers past errors. A moving average process creates a new set of random variables from an old set, just as the random walk does^{7}.
MA(q) process:
Instead of using past values of the forecast variable in a regression, a moving average model uses past forecast errors in a regression-like model.
We refer to this as an MA(q) model, a moving average model of order q. We do not observe the values of Z_t, so it is not really a regression in the usual sense.
X_t = β_0 Z_t + β_1 Z_{t−1} + ⋯ + β_q Z_{t−q}    (Equation 8)^{6}
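A useful property of Equation 8 is that the ACF of an MA(q) process cuts off after lag q; for an MA(1) process the lag-1 autocorrelation is θ/(1 + θ²). A NumPy sketch with an illustrative θ = 0.8 (my choice, not from the paper):

```python
import numpy as np

# Simulate an MA(1) process: X_t = Z_t + 0.8 Z_{t-1}
rng = np.random.default_rng(3)
z = rng.normal(size=5001)
theta = 0.8
x = z[1:] + theta * z[:-1]

def acf_at(x, k):
    """Sample autocorrelation at a single lag k."""
    x = x - x.mean()
    return np.sum(x[:-k] * x[k:]) / np.sum(x * x)

rho1 = acf_at(x, 1)   # theoretical value: theta / (1 + theta^2), about 0.488
rho2 = acf_at(x, 2)   # theoretical value: 0 -- the ACF "cuts off" after lag q
```

This cut-off behaviour is what lets the ACF plot suggest the order q of an MA model.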
Autoregressive Moving Average Model ARMA(p,q): Autoregressive moving average models are simply a combination of an AR model and an MA model.
ARMA(p, q) process:
X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + ⋯ + φ_p X_{t−p} + Z_t + β_1 Z_{t−1} + ⋯ + β_q Z_{t−q}    (Equation 9)^{6}
Autoregressive Integrated Moving Average Model ARIMA(p,d,q): A process X_t is an autoregressive integrated moving average process of order (p, d, q) if
Y_t = ∇^d X_t = (1 − B)^d X_t    (Equation 10)^{6}
is an ARMA(p, q) process, where B is the backshift operator. Here d represents the number of non-seasonal differences needed for stationarity.
SARIMA(p,d,q)(P,D,Q): The SARIMA model is an extension of the ARIMA model; the only difference is that it adds a seasonal component. As we saw, ARIMA can make a non-stationary time series stationary by adjusting for trend, whereas the SARIMA model can do so by removing both trend and seasonality. Here p, d, q are the AR order, degree of differencing, and MA order for the non-seasonal part, and P, D, Q are the corresponding orders for the seasonal part.
s is the number of periods in one season.
Φ_P(B^s) φ_p(B) (1 − B^s)^D (1 − B)^d X_t = Θ_Q(B^s) θ_q(B) Z_t    (Equation 11)^{6}
Exponential Smoothing: The simplest of the exponential smoothing methods is naturally called simple exponential smoothing (SES). This method is suitable for forecasting data with no clear trend or seasonal pattern. Exponential smoothing forecasts are likewise weighted sums of past observations, but the model explicitly uses exponentially decreasing weights for past observations.
Single Exponential Smoothing: SES for short, also called simple exponential smoothing; this method is applied for time series forecasting on univariate data without considering trend or seasonality.
Double Exponential Smoothing: an extension of exponential smoothing that explicitly adds support for trend in univariate time series.
Triple Exponential Smoothing: an extension of exponential smoothing that explicitly adds support for seasonality in univariate time series.
x̂_{n+1} = α x_n + α(1 − α) x_{n−1} + α(1 − α)² x_{n−2} + ⋯ + α(1 − α)^k x_{n−k} + ⋯    (SES)    (Equation 12)^{6}
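Equation 12 is usually computed recursively rather than as an explicit weighted sum, using s_t = α x_t + (1 − α) s_{t−1}. A minimal sketch (the helper `ses_forecast` is my own name, not from the paper):

```python
import numpy as np

def ses_forecast(x, alpha):
    """Simple exponential smoothing: the forecast is an exponentially
    weighted sum of past observations (Equation 12), computed recursively
    as s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    s = x[0]
    for value in x[1:]:
        s = alpha * value + (1 - alpha) * s
    return s   # SES produces a flat forecast for all future steps

x = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
print(ses_forecast(x, alpha=0.5))   # prints 12.0
```

With α = 1 the forecast is simply the last observation; smaller α spreads the weight further into the past.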
RESULTS AND DISCUSSIONS:
We have time series data; by applying stochastic processes we obtain different models, and judging the best model depends on its quality. With the help of the PACF we get the p value of an AR(p) model; similarly, the ACF gives the q value of an MA(q) model. With both p and q we get ARMA(p,q); additionally accounting for trend gives ARIMA(p,d,q), and adding seasonality gives the SARIMA model. Two common ways of judging the quality of a time series model are the SSE (sum of squared errors) and the AIC (Akaike Information Criterion). We always prefer the model with the lower AIC value.
To illustrate the components of time series data, we have taken AAPL stock closing price data from 2 January 2019 to 31 December 2019, i.e., one year.
In the plot below there is a gradually increasing underlying trend, with fairly regular variation superimposed on the trend that seems to repeat month over month. The initial transformations used to make the time series stationary are shown below, performed using the R tool.^{8-10}
Figure 2: Plot of AAPL stock data
Figure 3: Correlogram of AAPL stock data
Figure 4: Random walk model on AAPL stock
Figure 5: After differencing on AAPL stock
Figure 6: ACF plot of AAPL stock data
CONCLUSION:
Forecasting using time series data generally involves the use of past observations; knowing about the future impacts our decision making today. Forecasting with time series data poses a number of challenges, the biggest being that time series data have several different components. The observation at time t depends on the observations at times t−1, t−2, and possibly more past observations, while some observations are not influenced by other variables and are called exogenous. Some forecasting models are static, in which case there is no need for retraining^{11}; others are dynamic and require retraining. The general flow of building a forecasting model with a stochastic process is given in the diagram below.
Figure 7: General Flow diagram while applying stochastic modeling on time series data.
All the stochastic processes discussed in this research paper perform well for one-step as well as multi-step forecasting on univariate data. For non-linear time series forecasting, however, one can also use machine learning and deep learning methods.
CONFLICT OF INTEREST:
The authors have no conflicts of interest regarding this investigation.
ACKNOWLEDGMENTS:
I have been very fortunate to work with Prof. S. H. Patil, who is supervising the research carried out for my Ph.D. For me he has always been much more than just a guide; he has been a constant source of inspiration and motivation. I hope that this research paper, based on a review of time series analysis, fulfils his expectations.
REFERENCES:
1. Lawton R. Time Series Analysis and its Applications. Int. J. Forecast. 2001; 17: 299-301.
2. Shumway RH, Stoffer DS. Time Series Analysis and Its Applications: With R Examples. 2016.
3. Olatayo TO, Taiwo AI. Statistical Modelling and Prediction of Rainfall Time Series Data. Glob. J. Comput. Sci. Technol. 2014; 14: 1-10.
4. Etuk EH, Mohamed TM. Time Series Analysis of Monthly Rainfall Data for the Gadaref Rainfall Station, Sudan, by SARIMA Methods. Int. J. Sci. Res. Knowl. 2014; 320-327. doi:10.12983/ijsrk2014p03200327
5. Kumar V. Time Series Modeling and Forecasting Using Stochastic Models: A Review. Int. J. Eng. Sci. Res. Technol. doi:10.5281/zenodo.205828
6. Sadigov T, Thistleton W. Practical Time Series Analysis. Coursera. Available at: https://www.coursera.org/learn/practicaltimeseriesanalysis.
7. Narasanov Z. Time Series Forecasting Using a Moving Average Model for Extrapolation of Number of Tourist. UTMS J. Econ. 2018; 9: 121-132.
8. Vijh M, Chandola D, Tikkiwal VA, Kumar A. Stock Closing Price Prediction using Machine Learning Techniques. Procedia Comput. Sci. 2020; 67: 599-606.
9. Reddy CV. Predicting the Stock Market Index Using Stochastic Time Series. 2018.
10. Dhyani B, Kumar M, Verma P, Jain A. Stock Market Forecasting Technique using ARIMA Model. Int. J. Recent Technol. Eng. 2020; 8: 2694-2697.
11. Petrică A-C, Stancu S, et al. Limitation of ARIMA Models in Financial and Monetary Economics. Theor. Appl. Econ. 2016; 23: 19-42.
Received on 16.07.2021    Accepted on 13.12.2021    ©A&V Publications all rights reserved    Research J. Engineering and Tech. 2021; 12(4): 99-104. DOI: 10.52711/2321-581X.2021.00017
