跳转至

11.1.2.   促销对球赛上座率的分析

介绍

论文需求:

图片

部分展示

1 首先将题目的数据读入R的环境,绘制原始时序图,观察序列特征

train <- read.csv("~/Desktop/train.csv", head = TRUE, sep = ";")
train_ts <- ts(train, start = 1993, freq = 12)
plot(train_ts[, 2], type = "l", lwd = 2, col = "blue", xlab = "year", ylab = "train data", 
    main = "Number of trains running from 1993 to 1997", ylim = c(1100, 1350))

![图片](https://file.yoqi.me/assets/images/26181797.png)

2 可以看出原始序列非平稳,呈现出近似线性趋势,所以选择一阶差分,差分后再次绘制新的时序图

# 1st-order differentiation
d_train = ts(diff(train_ts[, 2]))
plot(d_train, type = "l", col = "red", xlab = "year", ylab = expression(Delta ~ 
    "# of trains"), main = "First order differentiation of train numbers")
plot of chunk unnamed-chunk-2

图片

3 时序图显示,差分后序列在均值附近比较稳定地波动.此时可以计算差分后序列得均值和方差

# The differentiated series has been stable, next we calculate mean and
# variance of d_train
mean_d_train = mean(d_train)  #  -0.09830508
var_d_train = var(d_train)  #  3255.08

4 绘制一阶差分序列的自相关和偏自相关图

par(mfrow = c(1, 2))
acf(d_train, 20, xlim = c(1, 20), main = "ACF figure(d=1)")
pacf(d_train, main = "PACF figure(d=1)")
plot of chunk unnamed-chunk-4

图片

5 一阶差分的PACF拖尾,考虑MA模型拟合1阶差分后序列,利用\( \hat{ | \rho_k|} \leq \frac{1}{\sqrt{n}}\sqrt{1+2\sum_{i=1}^{q}\hat{\rho}_i^2} \) ,判断\( {\hat{ | \rho_k|}} \)的截尾性,\( M=\lfloor\sqrt{57} \rfloor=7 \)

\( k=1,\frac{1}{\sqrt{n}}\sqrt{1+2\rho_{1}^2}=0.181 \),在\( {\rho_k(k=2...8)} \)中满足\( \rho_k \leq 0.181 \)的仅为\( 2/7=28.6\% \) \( k=2,\frac{1}{\sqrt{n}}\sqrt{1+2(\rho_{1}^2+\rho_{2}^2)}=0.192 \),在\( {\rho_k(k=3...9)} \)中满足\( \rho_k \leq 0.192 \)的仅为\( 4/7=57.1\% \) \( k=3,\frac{1}{\sqrt{n}}\sqrt{1+2(\rho_{1}^2+\rho_{2}^2+\rho_{3}^2)}=0.195 \),在\( {\rho_k(k=4...10)} \)中满足\( \rho_k \leq 0.192 \)的为\( 4/7=71.4\%>68.3\% \) 因此\( {\rho_k} \)为3步截尾,差分后序列满足MA(3)模型,总模型为ARIMA(0,1,3) 6 根据计算结果对时间序列进行拟合,从Box检验结果可以看出,残差p=0.4742大于0.05,即残差序列属于白噪声序列.同时,从残差的自相关图看出残差项不存在滞后性

fit <- arima(train[, 2], order = c(0, 1, 3))
Box.test(fit$residuals)
## 
##  Box-Pierce test
## 
## data:  fit$residuals
## X-squared = 0.5121, df = 1, p-value = 0.4742
tsdiag(fit)
plot of chunk unnamed-chunk-5

图片

fit
## 
## Call:
## arima(x = train[, 2], order = c(0, 1, 3))
## 
## Coefficients:
##          ma1    ma2     ma3
##       -1.060  0.384  -0.248
## s.e.   0.138  0.152   0.178
## 
## sigma^2 estimated as 1250:  log likelihood = -295.1,  aic = 598.1

7 可以写出ARIMA(0,1,3)的函数表达式为[ (1-B)x_t =\epsilon_t(1+1.06B-0.38B2+0.25B3) ] 8 预测未来一年的火车数量变化趋势

ts.plot(train_ts[, 2], type = "l", lwd = 2, col = "blue", xlab = "year", ylab = "train data", 
    main = "Prediction of train numbers in 1998", xlim = c(1993, 1999), ylim = c(1100, 
        1350))
train.pred <- predict(fit, 12)

lines(ts(train.pred$pred, start = 1998, freq = 12), col = "red")
lines(ts(train.pred$pred + 2 * train.pred$se, start = 1998, freq = 12), col = "red", 
    lty = 3)
lines(ts(train.pred$pred - 2 * train.pred$se, start = 1998, freq = 12), col = "red", 
    lty = 3)
plot of chunk unnamed-chunk-6

图片

版权说明:

本文档版权隶属 天问科技 ,仅用于天问科技旗下公司,团队为客户展示项目案例所用,任何盗用本公司图文,描述,案例的行为均属违法,我们保留追究法律责任的权利。