class: center, middle, inverse, title-slide

# IM532 3.0 Applied Time Series Forecasting
## MSc in Industrial Mathematics
### Introduction to ARIMA models
### 8 March 2020

---

## Stochastic processes

"A statistical phenomenon that evolves in time according to probabilistic laws is called a stochastic process." (Box, George EP, et al. Time series analysis: forecasting and control.)

--

## Non-deterministic time series or statistical time series

A sample realization from an infinite population of time series that could have been generated by a stochastic process.

--

## Probabilistic time series model

Let `\(\{X_1, X_2, ...\}\)` be a sequence of random variables. Then the joint distribution of the random vector `\([X_1, X_2, ..., X_n]'\)` is

`$$P[X_1 \le x_1, X_2 \le x_2, ..., X_n \le x_n]$$`

where `\(-\infty < x_1, ..., x_n < \infty\)` and `\(n = 1, 2, ...\)`

---

## Mean function

--

The **mean function** of `\(\{X_t\}\)` is

`$$\mu_X(t)=E(X_t).$$`

--

## Covariance function

The **covariance function** of `\(\{X_t\}\)` is

`$$\gamma_X(r, s)=Cov(X_r, X_s)=E[(X_r-\mu_X(r))(X_s-\mu_X(s))]$$`

for all integers `\(r\)` and `\(s\)`.

--

For a stationary series, the covariance function of `\(\{X_t\}\)` at lag `\(h\)` is defined by

`$$\gamma_X(h):=\gamma_X(h, 0)=\gamma_X(t+h, t)=Cov(X_{t+h}, X_t).$$`

---

## Autocovariance function

The autocovariance function of `\(\{X_t\}\)` at lag `\(h\)` is

`$$\gamma_X(h)=Cov(X_{t+h}, X_t).$$`

--

## Autocorrelation function

The autocorrelation function of `\(\{X_t\}\)` at lag `\(h\)` is

`$$\rho_X(h)=\frac{\gamma_X(h)}{\gamma_X(0)}=Cor(X_{t+h}, X_t).$$`

---

## Weakly stationary

A time series `\(\{X_t\}\)` is called weakly stationary if

- `\(\mu_X(t)\)` is independent of `\(t\)`.
- `\(\gamma_X(t+h, t)\)` is independent of `\(t\)` for each `\(h\)`.

In other words, the statistical properties of the time series (mean, variance, autocorrelation, etc.) do not depend on the time at which the series is observed; that is, there is no trend or seasonality. However, a time series with cyclic behaviour (but with no trend or seasonality) is stationary.

--

## Strict stationarity of a time series

A time series `\(\{X_t\}\)` is called strictly stationary if the random vectors `\([X_1, X_2, ..., X_n]'\)` and `\([X_{1+h}, X_{2+h}, ..., X_{n+h}]'\)` have the same joint distribution for all integers `\(h\)` and `\(n>0\)`.

---

# Simple time series models

## 1. iid noise

1. no trend or seasonal component
2. observations are independent and identically distributed (iid) random variables with zero mean
3. Notation: `\(\{X_t\} \sim IID(0, \sigma^2)\)`
4. plays an important role as a building block for more complicated time series

---------------------------------------------------------------------------------

**Joint distribution function of iid noise**

`$$P[X_1 \le x_1, X_2 \le x_2, ..., X_n \le x_n] = P[X_1 \le x_1]...P[X_n \le x_n].$$`

--

Let `\(F_X(.)\)` be the cumulative distribution function of each of the identically distributed random variables `\(X_1, X_2, ...\)`. Then,

`$$P[X_1 \le x_1, X_2 \le x_2, ..., X_n \le x_n] = F_X(x_1)...F_X(x_n).$$`

--

There is no dependence between observations. Hence,

`$$P[X_{n+h} \le x|X_1=x_1, ..., X_n=x_n] = P[X_{n+h} \le x].$$`
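
--

As a quick empirical check (a minimal sketch, not part of the derivation), we can simulate iid noise in R and inspect its sample autocorrelations, which should be close to zero at every nonzero lag:


```r
# sample ACF of simulated iid N(0, 1) noise; values at lags 1-5
# should be near zero since the observations are independent
set.seed(42)
acf(rnorm(200), lag.max = 5, plot = FALSE)
```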
---

## iid noise (cont.)

**Examples of iid noise**

1. Sequence of iid random variables `\(\{X_t, t=1, 2,...\}\)` with `\(P(X_t=1)=\frac{1}{2}\)` and `\(P(X_t = 0)=\frac{1}{2}.\)`

2. A sequence of iid standard normal random variables:


```r
rnorm(15)
```

```
 [1]  2.61665472 -1.47748881 -0.75695121 -1.70520956 -1.02075354  0.81739417
 [7]  0.01909737 -1.29025062  0.84205925  0.27636792 -1.37133390 -0.89830051
[13] -0.62070173  0.15142641  0.40849453
```

--

## Question

Let `\(\{X_t\}\)` be iid noise with `\(E(X_t^2)=\sigma^2<\infty\)`. Show that `\(\{X_t\}\)` is a stationary process.

---

# Simple time series models

## 2. White noise

If `\(\{X_t\}\)` is a sequence of uncorrelated random variables, each with zero mean and variance `\(\sigma^2\)`, then such a sequence is referred to as **white noise**.

Note: Every `\(IID(0, \sigma^2)\)` sequence is `\(WN(0, \sigma^2)\)` but not conversely.

---

# Simple time series models

## 3. Random walk

A random walk process is obtained by cumulatively summing iid random variables. If `\(\{S_t, t=0, 1, 2, ...\}\)` is a random walk process, then

`\(S_0 = 0\)`

`\(S_1 = 0 + X_1\)`

`\(S_2 = 0 + X_1 + X_2\)`

`\(...\)`

`\(S_t = X_1 + X_2 + ... + X_t.\)`

--

## Question

Is `\(\{S_t, t=0, 1, 2, ...\}\)` a weakly stationary process?

---

## Identifying non-stationarity in the mean

- Using the time series plot
- Using the ACF plot
  - The ACF of a stationary time series will drop to zero relatively quickly.
  - The ACF of a non-stationary series decreases slowly.
  - For a non-stationary series, the ACF at lag 1 is often large and positive.

---

# Elimination of Trend and Seasonality by Differencing

- Differencing helps to stabilize the mean.

## Backshift notation:

`$$BX_t=X_{t-1}$$`

## Ordinary differencing

The first-order differencing can be defined as

`$$\nabla X_t = X_t-X_{t-1}=X_t-BX_t=(1-B)X_t$$`

where `\(\nabla=1-B\)`.

The second-order differencing

`$$\nabla^2X_t=\nabla(\nabla X_t)=\nabla(X_t-X_{t-1})=\nabla X_t - \nabla X_{t-1}=(X_t-X_{t-1})-(X_{t-1}-X_{t-2})=X_t-2X_{t-1}+X_{t-2}$$`

- In practice, we seldom need to go beyond second-order differencing.

---

## Seasonal differencing

- Differencing between an observation and the corresponding observation from the previous year:

`$$\nabla_mX_t=X_t-X_{t-m}=(1-B^m)X_t$$`

where `\(m\)` is the seasonal period (the number of observations per year): `\(m=12\)` for monthly data and `\(m=4\)` for quarterly data.

For a monthly series

`$$\nabla_{12}X_t=X_t-X_{t-12}$$`

Applying both a seasonal and a first difference gives the twice-differenced series

`$$\nabla\nabla_{12}X_t=\nabla_{12}X_t-\nabla_{12}X_{t-1}=(X_t-X_{t-12})-(X_{t-1}-X_{t-13})$$`

If seasonality is strong, the seasonal differencing should be done first.
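
---

## Differencing in R

Both kinds of differences are available through base R's `diff()`. A minimal sketch, using the built-in `AirPassengers` monthly series purely for illustration (it is not one of the series plotted on the following slides):


```r
x <- log(AirPassengers)     # an illustrative monthly ts object
d1  <- diff(x)              # ordinary difference: (1 - B) x_t
d12 <- diff(x, lag = 12)    # seasonal difference: (1 - B^12) x_t
dd  <- diff(d12)            # seasonal difference followed by a first difference
```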
---

.pull-left[

Original series

![](timeseries2_files/figure-html/unnamed-chunk-2-1.png)<!-- -->![](timeseries2_files/figure-html/unnamed-chunk-2-2.png)<!-- -->

]

.pull-right[

Differenced series

![](timeseries2_files/figure-html/unnamed-chunk-3-1.png)<!-- -->![](timeseries2_files/figure-html/unnamed-chunk-3-2.png)<!-- -->

]

- For a stationary time series, the ACF will drop to zero relatively quickly, while the ACF of non-stationary data decreases slowly.

---

.pull-left[

Original series

![](timeseries2_files/figure-html/unnamed-chunk-4-1.png)<!-- -->![](timeseries2_files/figure-html/unnamed-chunk-4-2.png)<!-- -->

]

.pull-right[

Differenced series

![](timeseries2_files/figure-html/unnamed-chunk-5-1.png)<!-- -->![](timeseries2_files/figure-html/unnamed-chunk-5-2.png)<!-- -->

]

---

.pull-left[

Original series

![](timeseries2_files/figure-html/unnamed-chunk-6-1.png)<!-- -->![](timeseries2_files/figure-html/unnamed-chunk-6-2.png)<!-- -->

]

.pull-right[

Seasonal differencing

![](timeseries2_files/figure-html/unnamed-chunk-7-1.png)<!-- -->![](timeseries2_files/figure-html/unnamed-chunk-7-2.png)<!-- -->

]

---

# Linear filter model

A linear filter is an operation `\(L\)` which transforms a white noise process into another time series `\(\{X_t\}\)`.

White noise `\(\epsilon_t\)` `\(\to\)` Linear Filter `\(\psi(B)\)` `\(\to\)` Output `\(X_t\)`

---

# Autoregressive models

current value = linear combination of past values + current error

An autoregressive model of order `\(p\)`, the `\(AR(p)\)` model, can be written as

`$$x_t= c+ \phi_1x_{t-1}+\phi_2x_{t-2}+...+\phi_px_{t-p}+\epsilon_t,$$`

where `\(\epsilon_t\)` is white noise.

- Similar to a multiple linear regression model, but with lagged values of `\(x_t\)` as predictors.

--

## Question

Show that `\(AR(p)\)` is a linear filter with transfer function `\(\phi^{-1}(B)\)`, where `\(\phi(B)=1-\phi_1B-\phi_2B^2-...-\phi_pB^p.\)`

--

## Stationarity condition for AR(p)

The roots of `\(\phi(B)=0\)` (characteristic equation) must lie outside the unit circle.

---

# Stationarity Conditions for AR(1)

Let's consider the `\(AR(1)\)` process,

`$$x_t=\phi_1x_{t-1}+\epsilon_t.$$`

Then,

`$$(1-\phi_1 B)x_t=\epsilon_t.$$`

This may be written as

`$$x_t=(1-\phi_1 B)^{-1}\epsilon_t=\sum_{j=0}^{\infty}\phi_1^j\epsilon_{t-j}.$$`

Hence,

`$$\psi(B)=(1-\phi_1 B)^{-1}=\sum_{j=0}^{\infty}\phi_1^jB^j.$$`

If `\(|\phi_1| < 1,\)` the `\(AR(1)\)` process is stationary. This is equivalent to saying that the root of `\(1-\phi_1B=0\)` must lie outside the unit circle.

---

Geometric series

`$$a+ar+ar^2+ar^3+...=\frac{a}{1-r}$$`

for `\(|r|<1.\)`

---

## AR(1) process

`$$x_t=c+\phi_1x_{t-1}+\epsilon_t,$$`

- when `\(\phi_1=0\)` - equivalent to a white noise process
- when `\(\phi_1=1\)` and `\(c=0\)` - random walk
- when `\(\phi_1=1\)` and `\(c \neq 0\)` - random walk with drift
- when `\(\phi_1 < 0\)` - oscillates around the mean

---

Find the mean, variance and the autocorrelation function of an AR(1) process.

Help:

`$$Cov(X+Y, X+Y)=Var(X+Y)=Var(X)+Var(Y)+2Cov(X, Y)$$`

`$$Cov(X, Y)=E[(X-\mu_X)(Y-\mu_Y)]$$`

---

## Question: Properties of AR(2) process

Find the mean, variance and autocorrelation function.

---

## Yule-Walker Equations

Express the autoregressive parameters in terms of the autocorrelations.

---

## Autocorrelation function of an AR(p) process

The autocorrelation function of an AR(p) process satisfies

`$$\rho_k=\phi_1\rho_{k-1}+\phi_2\rho_{k-2}+...+\phi_p\rho_{k-p}$$`

for `\(k>0\)`. This can be written as

`$$\phi(B)\rho_k=0,$$`

where `\(\phi(B)=1-\phi_1 B-...-\phi_p B^p.\)`

For the `\(AR(1)\)` process

`$$\rho_k=\phi_1\rho_{k-1},$$`

for `\(k>0\)`. Since `\(\rho_0=1,\)` for `\(k \geq 1\)` we get

`$$\rho_k=\phi_1^k.$$`

1. When `\(\phi_1>0\)`, the autocorrelation function decays exponentially to zero.
1. When `\(\phi_1<0\)`, the autocorrelation function decays exponentially to zero and oscillates in sign.

---

## Partial Autocorrelation Function

The conditional autocorrelation between `\(X_t\)` and `\(X_{t+k}\)` given `\(X_{t+1}, X_{t+2}, ..., X_{t+k-1}\)`:

`$$Cor(X_t, X_{t+k}|X_{t+1}, X_{t+2}, ..., X_{t+k-1})$$`

The partial autocorrelation function `\(\phi_{kk}\)` of `\(AR(p)\)` will be non-zero for `\(k\)` less than or equal to `\(p\)` and zero for `\(k\)` greater than `\(p\)`.

Calculations: in-class

---

## Moving average models

current value = linear combination of past forecast errors + current error

An `\(MA(q)\)` process can be written as

`$$x_t=c+\epsilon_t+\theta_1\epsilon_{t-1}+\theta_2\epsilon_{t-2}+...+\theta_q\epsilon_{t-q}$$`

where `\(\epsilon_t\)` is white noise.

- The current value `\(x_t\)` can be thought of as a weighted moving average of the past few forecast errors.
- `\(MA(q)\)` is always stationary.

--

- Any `\(AR(p)\)` model can be written as an `\(MA(\infty)\)` model using repeated substitution; see the numerical check below.
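
The `\(MA(\infty)\)` weights can be checked numerically with base R's `ARMAtoMA()`. A minimal sketch for an `\(AR(1)\)` with `\(\phi_1 = 0.6\)`, whose `\(\psi\)` weights should equal `\(0.6^j\)`:


```r
# psi weights of the MA(infinity) representation of an AR(1) process
ARMAtoMA(ar = 0.6, lag.max = 6)
0.6^(1:6)   # should match the line above
```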
---

## Invertibility condition of an MA model

`$$x_t=c+\epsilon_t+\theta_1\epsilon_{t-1}+\theta_2\epsilon_{t-2}+...+\theta_q\epsilon_{t-q}$$`

Using the backshift operator,

`$$x_t=c+\epsilon_t+\theta_1 B \epsilon_{t}+\theta_2 B^2 \epsilon_{t}+...+\theta_q B^q \epsilon_{t}.$$`

Hence,

`$$x_t=\theta_q(B)\epsilon_t+c,$$`

where `\(\theta_q(B)=1+\theta_1B+...+\theta_qB^q\)` is the MA operator.

Any `\(MA(q)\)` process can be written as an `\(AR(\infty)\)` process if we impose some constraints on the MA parameters. Then it is called **invertible**.

An MA model is invertible if the roots of `\(\theta_q(B)=0\)` lie outside the unit circle.

---

## MA(1) process

1. Obtain the invertibility condition of the `\(MA(1)\)` process.
1. Calculate the mean, variance, ACF and PACF of the `\(MA(1)\)` process.

---

## MA(2) process

Calculate the mean, variance, ACF and PACF of the `\(MA(2)\)` process.

---

# ACF and PACF of AR(p) vs MA(q) processes

.pull-left[

# AR(p)

**ACF:** decays exponentially to zero.

**PACF:** the PACF of an `\(AR(p)\)` process is zero beyond the order `\(p\)` of the process (cuts off after lag `\(p\)`).

]

.pull-right[

# MA(q)

**ACF:** the ACF of an `\(MA(q)\)` process is zero beyond the order `\(q\)` of the process (cuts off after lag `\(q\)`).

**PACF:** decays exponentially to zero.

]

---

## ARMA process

current value = linear combination of past values + linear combination of past errors + current error

The `\(ARMA(p, q)\)` model can be written as

`$$x_t=c+\phi_1 x_{t-1}+\phi_2 x_{t-2}+...+\phi_px_{t-p}+\theta_1\epsilon_{t-1}+\theta_2\epsilon_{t-2}+...+\theta_q\epsilon_{t-q}+\epsilon_t,$$`

where `\(\{\epsilon_t\}\)` is a white noise process. Using the backshift operator,

`$$\phi(B)x_t=\theta(B)\epsilon_t,$$`

where `\(\phi(.)\)` and `\(\theta(.)\)` are the `\(p\)`th and `\(q\)`th degree polynomials

`$$\phi(B)=1-\phi_1B-...-\phi_pB^p,$$`

and

`$$\theta(B)=1+\theta_1B+...+\theta_qB^q.$$`

---

## ARMA(p, q) process

**Stationarity condition**

Roots of `\(\phi(B)=0\)` lie outside the unit circle.

**Invertibility condition**

Roots of `\(\theta(B)=0\)` lie outside the unit circle.

---

## Question:

Determine which of the following processes are invertible and stationary.

1. `\(x_t=0.6x_{t-1}+\epsilon_t\)`
1. `\(x_t=\epsilon_t-1.3\epsilon_{t-1}+0.4\epsilon_{t-2}\)`
1. `\(x_t=0.6x_{t-1}-1.3\epsilon_{t-1}+0.4\epsilon_{t-2}+\epsilon_t\)`
1. `\(x_t=x_{t-1}-1.3\epsilon_{t-1}+0.3\epsilon_{t-2}+\epsilon_t\)`

---

## ARMA(1, 1) model

Calculate the mean, variance and ACF of the ARMA(1, 1) process.

---

## Non-seasonal ARIMA models

- Autoregressive Integrated Moving Average Model

`\(x'_t=c+\phi_1x'_{t-1}+...+\phi_px'_{t-p}+\theta_1\epsilon_{t-1}+...+\theta_q\epsilon_{t-q}+\epsilon_t,\)`

where `\(x'_t\)` is the differenced series. Using the backshift notation,

`$$(1-\phi_1B-...-\phi_pB^p)(1-B)^dx_t=c+(1+\theta_1B+...+\theta_qB^q)\epsilon_t$$`

- We call this an `\(ARIMA(p, d, q)\)` model, where
  - `\(p\)` - order of the autoregressive part
  - `\(d\)` - degree of first differencing involved
  - `\(q\)` - order of the moving average part
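
---

## Non-seasonal ARIMA models in R

As an illustration (a sketch only; the series here is simulated, not one of the course data sets), an `\(ARIMA(1, 1, 0)\)` process can be simulated and refitted with base R's `arima.sim()` and `arima()`:


```r
set.seed(123)
# simulate 200 observations from an ARIMA(1,1,0) with phi_1 = 0.7
x <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.7), n = 200)
# fit the same specification back to the simulated series;
# the AR coefficient estimate should be near 0.7
arima(x, order = c(1, 1, 0))
```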
---

# Seasonal ARIMA model

## Seasonal differencing

For a monthly series

`$$(1-B^{12})x_t=x_t-x_{t-12}$$`

For a quarterly series

`$$(1-B^{4})x_t=x_t-x_{t-4}$$`

## Non-seasonal differencing

`$$(1-B)x_t=x_t-x_{t-1}$$`

## Multiply terms together

`$$(1-B)(1-B^m)x_t=(1-B-B^m+B^{m+1})x_t=x_t-x_{t-1}-x_{t-m}+x_{t-m-1}.$$`

---

## Seasonal ARIMA models

`$$ARIMA(p, d, q)(P, D, Q)_m$$`

- `\((p, d, q)\)` - non-seasonal part of the model
- `\((P, D, Q)\)` - seasonal part of the model
- `\(m\)` - number of observations per year

`$$ARIMA(1,1,1)(1,1,1)_4$$`

`$$(1-\phi_1B)(1-\Phi_1B^4)(1-B)(1-B^4)x_t=(1+\theta_1B)(1+\Theta_1B^4)\epsilon_t$$`

`\((1-\phi_1B)\)` - Non-seasonal `\(AR(1)\)`

`\((1-\Phi_1B^4)\)` - Seasonal `\(AR(1)\)`

`\((1-B)\)` - Non-seasonal difference

`\((1-B^4)\)` - Seasonal difference

`\((1+\theta_1B)\)` - Non-seasonal `\(MA(1)\)`

`\((1+\Theta_1B^4)\)` - Seasonal `\(MA(1)\)`

---

## Question

1. Write down the models for

`$$ARIMA(0, 0, 1)(0, 0, 1)_{12}$$`

`$$ARIMA(1,0,0)(1,0,0)_{12}$$`
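
---

## Fitting a seasonal ARIMA in R

A minimal sketch of fitting the `\(ARIMA(1,1,1)(1,1,1)_4\)` specification above with base R's `arima()`, using the built-in quarterly `UKgas` series purely as an illustrative example:


```r
# quarterly UK gas consumption (base R dataset), on the log scale
fit <- arima(log(UKgas), order = c(1, 1, 1),
             seasonal = list(order = c(1, 1, 1), period = 4))
fit   # print estimated seasonal and non-seasonal coefficients
```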