+ - 0:00:00
Notes for current slide
Notes for next slide

IM532 3.0 Applied Time Series Forecasting

Dynamic regression models

Dr. Thiyanga Talagala

21 June 2020

1 / 22

Introduction

  • ARIMA models

  • ETS models

  • Dynamic regression models: How to include external variables

Regression model

yt=β0+β1x1,t+...+βkxk,t+ϵtyt=β0+β1x1,t+...+βkxk,t+ϵt

Allow the errors from a regression to contain autocorrelation.

yt=β0+β1x1,t+...+βkxk,t+ηtyt=β0+β1x1,t+...+βkxk,t+ηt (1ϕ1B)(1B)ηt=(1+θ1B)ϵt(1ϕ1B)(1B)ηt=(1+θ1B)ϵt

where ϵtϵt is a white noise series.

ηtηt follows an ARIMA(1,1,1)ARIMA(1,1,1)

2 / 22

Stationary variables vs Nonstationary Variables

  • If all of the variables in the model are stationary, then we only need to consider ARMA errors for the residuals

  • Regression model with ARIMA errors is equivalent to a regression model in differences with ARMA errors.

yt=β0+β1x1,t+...+βkxk,t+ηtyt=β0+β1x1,t+...+βkxk,t+ηt

(1ϕ1B)ηt=(1+θ1B)ϵt(1ϕ1B)ηt=(1+θ1B)ϵt

yt=ytyt1yt=ytyt1

xt,i=xt,ixt1,ixt,i=xt,ixt1,i

ηt=ηtηt1ηt=ηtηt1

3 / 22

Regression with ARIMA errors in R

To fit

yt=β1xt+ηt,yt=β1xt+ηt,

where ηt=ϕ1ηt1+ϵtηt=ϕ1ηt1+ϵt is an AR(1)AR(1) error.

This is equivalent to

yt=β0+β1xt+ηtyt=β0+β1xt+ηt where ηtηt is an ARIMA(1,1,0)ARIMA(1,1,0) error.

Constant term disappears due to the differencing.

fit <- Arima(y, xreg=x, order=c(1,1,0))
4 / 22

Using automated ARIMA algorithm

library(forecast)
library(fpp2)
head(uschange)
Consumption Income Production Savings Unemployment
1970 Q1 0.6159862 0.9722610 -2.4527003 4.8103115 0.9
1970 Q2 0.4603757 1.1690847 -0.5515251 7.2879923 0.5
1970 Q3 0.8767914 1.5532705 -0.3587079 7.2890131 0.5
1970 Q4 -0.2742451 -0.2552724 -2.1854549 0.9852296 0.7
1971 Q1 1.8973708 1.9871536 1.9097341 3.6577706 -0.1
1971 Q2 0.9119929 1.4473342 0.9015358 6.0513418 -0.1
5 / 22

Using automated ARIMA algorithm

autoplot(uschange[,1:2], facets=TRUE) +
xlab("Year") + ylab("") +
ggtitle("Quarterly changes in US consumption
and personal income")

6 / 22

Using automated ARIMA algorithm

fit <- auto.arima(uschange[,"Consumption"],
xreg=uschange[,"Income"])
fit
Series: uschange[, "Consumption"]
Regression with ARIMA(1,0,2) errors
Coefficients:
ar1 ma1 ma2 intercept xreg
0.6922 -0.5758 0.1984 0.5990 0.2028
s.e. 0.1159 0.1301 0.0756 0.0884 0.0461
sigma^2 estimated as 0.3219: log likelihood=-156.95
AIC=325.91 AICc=326.37 BIC=345.29

The fitted model is

ˆyt=0.599+0.203xt+ηt^yt=0.599+0.203xt+ηt

ηt=0.692ηt1+ϵt0.576ϵt1+0.198ϵt2ηt=0.692ηt1+ϵt0.576ϵt1+0.198ϵt2

ϵtNID(0,0.322)ϵtNID(0,0.322)

7 / 22

plot ηtηt and ϵtϵt

residuals function can be used to extract ηtηt and ϵtϵt.

library(magrittr)
cbind("Regression Errors" = residuals(fit, type="regression"),
"ARIMA errors" = residuals(fit, type="innovation")) %>%
autoplot(facets=TRUE)

8 / 22

It is the ARIMA errors that should resemble a white noise series.

checkresiduals(fit)

Ljung-Box test
data: Residuals from Regression with ARIMA(1,0,2) errors
Q* = 5.8916, df = 3, p-value = 0.117
Model df: 5. Total lags used: 8
9 / 22

Calculate forecasts

  1. We first need to forecast predictors
fcast <- forecast(fit, xreg=rep(mean(uschange[,2]),8))
autoplot(fcast) + xlab("Year") + ylab("Percentage change")

10 / 22

Forecasting electricity demand

Model daily electricity demand as a function of temperature using quadratic regression with ARMA errors.

11 / 22

Forecasting electricity demand

12 / 22

Forecasting electricity demand

xreg <- cbind(MaxTemp = elecdaily[, "Temperature"],
MaxTempSq = elecdaily[, "Temperature"]^2,
Workday = elecdaily[, "WorkDay"])
fit <- auto.arima(elecdaily[, "Demand"], xreg = xreg)
fit
Series: elecdaily[, "Demand"]
Regression with ARIMA(2,1,2)(2,0,0)[7] errors
Coefficients:
ar1 ar2 ma1 ma2 sar1 sar2 drift MaxTemp
-0.0622 0.6731 -0.0234 -0.9301 0.2012 0.4021 -0.0191 -7.4996
s.e. 0.0714 0.0667 0.0413 0.0390 0.0533 0.0567 0.1091 0.4409
MaxTempSq Workday
0.1789 30.5695
s.e. 0.0084 1.2891
sigma^2 estimated as 43.72: log likelihood=-1200.7
AIC=2423.4 AICc=2424.15 BIC=2466.27

Forecasting

forecast(fit, xreg = cbind(20, 20^2, 1)) # Forecast one day ahead
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
53.14286 185.4008 176.9271 193.8745 172.4414 198.3602
13 / 22

Forecasting electricity demand (cont.)

checkresiduals(fit)

Ljung-Box test
data: Residuals from Regression with ARIMA(2,1,2)(2,0,0)[7] errors
Q* = 28.229, df = 4, p-value = 1.121e-05
Model df: 10. Total lags used: 14
14 / 22

Lagged predictors

yt=β0+γ0xt+γ1xt1+...+γkxtk+ηtyt=β0+γ0xt+γ1xt1+...+γkxtk+ηt

where ηtηt is an ARIMAARIMA process.

How to select kk ?

Use AICc along with the values of pp and qq for the ARIMA error.

15 / 22
autoplot(insurance, facets=TRUE) +
xlab("Year") + ylab("") +
ggtitle("Insurance advertising and quotations")

16 / 22

Lagged predictors

Advert <- cbind(
AdLag0 = insurance[,"TV.advert"],
AdLag1 = stats::lag(insurance[,"TV.advert"],-1),
AdLag2 = stats::lag(insurance[,"TV.advert"],-2),
AdLag3 = stats::lag(insurance[,"TV.advert"],-3)) %>%
head(NROW(insurance))
head(Advert)
AdLag0 AdLag1 AdLag2 AdLag3
Jan 2002 7.212725 NA NA NA
Feb 2002 9.443570 7.212725 NA NA
Mar 2002 7.534250 9.443570 7.212725 NA
Apr 2002 7.212725 7.534250 9.443570 7.212725
May 2002 9.443570 7.212725 7.534250 9.443570
Jun 2002 6.415215 9.443570 7.212725 7.534250
17 / 22

Lagged predictors (cont.)

fit1 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1],
stationary=TRUE)
fit1[["aicc"]]
[1] 68.49968
fit2 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1:2],
stationary=TRUE)
fit2[["aicc"]]
[1] 60.02357
fit3 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1:3],
stationary=TRUE)
fit3[["aicc"]]
[1] 62.83253
fit4 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1:4],
stationary=TRUE)
fit4[["aicc"]]
[1] 65.45747
18 / 22

Lagged predictors (cont.)

Re-estimate the model

fit <- auto.arima(insurance[,1], xreg=Advert[,1:2],
stationary=TRUE); fit
Series: insurance[, 1]
Regression with ARIMA(3,0,0) errors
Coefficients:
ar1 ar2 ar3 intercept AdLag0 AdLag1
1.4117 -0.9317 0.3591 2.0393 1.2564 0.1625
s.e. 0.1698 0.2545 0.1592 0.9931 0.0667 0.0591
sigma^2 estimated as 0.2165: log likelihood=-23.89
AIC=61.78 AICc=65.4 BIC=73.43

The fitted model is

yt=2.039+1.25xt+0.162xt1+ηtyt=2.039+1.25xt+0.162xt1+ηt ηt=1.412ηt10.932ηt2+0.359ηt3+ϵtηt=1.412ηt10.932ηt2+0.359ηt3+ϵt

ytyt - the number of quotations provided in month tt

xtxt - the advertising expenditure in month tt

ϵtϵt is white noise.

19 / 22

Generate forecasts

fc8 <- forecast(fit, h=20,
xreg=cbind(AdLag0 = rep(8,20),
AdLag1 = c(Advert[40,1], rep(8,19))))
autoplot(fc8) + ylab("Quotes") +
ggtitle("Forecast quotes with future advertising set to 8")

20 / 22
checkresiduals(fc8)

Ljung-Box test
data: Residuals from Regression with ARIMA(3,0,0) errors
Q* = 2.0324, df = 3, p-value = 0.5657
Model df: 6. Total lags used: 9
21 / 22

Other approaches

  • Neural network models

  • Machine learning approaches

  • TBATS models

  • Forecasting with decomposition

  • Vector autoregressions

22 / 22

Introduction

  • ARIMA models

  • ETS models

  • Dynamic regression models: How to include external variables

Regression model

yt=β0+β1x1,t+...+βkxk,t+ϵtyt=β0+β1x1,t+...+βkxk,t+ϵt

Allow the errors from a regression to contain autocorrelation.

yt=β0+β1x1,t+...+βkxk,t+ηtyt=β0+β1x1,t+...+βkxk,t+ηt (1ϕ1B)(1B)ηt=(1+θ1B)ϵt(1ϕ1B)(1B)ηt=(1+θ1B)ϵt

where ϵtϵt is a white noise series.

ηtηt follows an ARIMA(1,1,1)ARIMA(1,1,1)

2 / 22
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
s Start & Stop the presentation timer
t Reset the presentation timer
?, h Toggle this help
Esc Back to slideshow