Help for package smooth

Type:

Package

Title:

Forecasting Using State Space Models

Version:

4.5.0

Date:

2026-06-19

URL:

https://github.com/config-i1/smooth

BugReports:

https://github.com/config-i1/smooth/issues

Language:

en-GB

Description:

Functions implementing Single Source of Error state space models for purposes of time series analysis and forecasting. The package includes ADAM (Svetunkov, 2023, https://openforecast.org/adam/), Exponential Smoothing (Hyndman et al., 2008, <doi:10.1007/978-3-540-71918-2>), SARIMA (Svetunkov & Boylan, 2019 <doi:10.1080/00207543.2019.1600764>), Complex Exponential Smoothing (Svetunkov & Kourentzes, 2018, <doi:10.13140/RG.2.2.24986.29123>), Simple Moving Average (Svetunkov & Petropoulos, 2018 <doi:10.1080/00207543.2017.1380326>) and several simulation functions. It also allows dealing with intermittent demand based on the iETS framework (Svetunkov & Boylan, 2019, <doi:10.13140/RG.2.2.35897.06242>).

License:

LGPL-2.1

Depends:

R (≥ 3.0.2), greybox (≥ 2.0.2)

Imports:

Rcpp (≥ 0.12.3), stats, generics (≥ 0.1.2), graphics, grDevices, methods, statmod, MASS, nloptr, utils, xtable, zoo

LinkingTo:

Rcpp, RcppArmadillo (≥ 0.8.100.0.0)

Suggests:

legion, numDeriv, testthat, knitr, rmarkdown, doMC, doParallel, foreach

VignetteBuilder:

knitr

Encoding:

UTF-8

ByteCompile:

true

Config/roxygen2/version:

8.0.0

RoxygenNote:

8.0.0

NeedsCompilation:

yes

Packaged:

2026-06-20 21:10:18 UTC; config

Author:

Ivan Svetunkov [aut, cre] (Senior Lecturer at Centre for Marketing Analytics and Forecasting, Lancaster University, UK)

Maintainer:

Ivan Svetunkov <ivan@svetunkov.com>

Repository:

CRAN

Date/Publication:

2026-06-21 05:10:02 UTC

Smooth package

Description

Package contains functions implementing Single Source of Error state space models for purposes of time series analysis and forecasting.

Details

Package:	smooth
Type:	Package
Date:	2016-01-27 - Inf
License:	GPL-2

The following functions are included in the package:

es - Exponential Smoothing in Single Source of Errors State Space form.
ces - Complex Exponential Smoothing.
gum - Generalised Exponential Smoothing.
ssarima - SARIMA in state space framework.
auto.ces - Automatic selection between seasonal and non-seasonal CES.
auto.ssarima - Automatic selection of ARIMA orders.
sma - Simple Moving Average in state space form.
smoothCombine - the function that combines forecasts from es(), ces(), gum(), ssarima() and sma() functions.
cma - Centered Moving Average. This is for smoothing time series, not for forecasting.
sim.es - simulate time series using ETS as a model.
sim.ces - simulate time series using CES as a model.
sim.ssarima - simulate time series using SARIMA as a model.
sim.gum - simulate time series using GUM as a model.
sim.sma - simulate time series using SMA.
sim.oes - simulate time series based on occurrence part of ETS model.
oes - occurrence part of the intermittent state space model.

There are also several methods implemented in the package for the classes "smooth" and "smooth.sim":

orders - extracts orders of the fitted model.
lags - extracts lags of the fitted model.
modelType - extracts type of the fitted model.
forecast - produces forecast using provided model.
multicov - returns covariance matrix of multiple steps ahead forecast errors.
pls - returns Prediction Likelihood Score.
nparam - returns number of the estimated parameters.
fitted - extracts fitted values from provided model.
getResponse - returns actual values from the provided model.
residuals - extracts residuals of provided model.
plot - plots either states of the model or produced forecast (depending on what object is passed).
simulate - uses sim functions (sim.es, sim.ces, sim.ssarima, sim.gum, sim.sma and sim.oes) in order to simulate data using the provided object.
summary - provides summary of the object.
AICc, BICc - return, guess what...

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Svetunkov, I., Boylan, J.E., 2023a. iETS: State Space Model for Intermittent Demand Forecastings. International Journal of Production Economics. 109013. doi:10.1016/j.ijpe.2023.109013
Teunter R., Syntetos A., Babai Z. (2011). Intermittent demand: Linking forecasting to inventory obsolescence. European Journal of Operational Research, 214, 606-615.
Croston, J. (1972) Forecasting and stock control for intermittent demands. Operational Research Quarterly, 23(3), 289-303.

Svetunkov, I., Kourentzes, N., & Ord, J. K. (2022). Complex exponential smoothing. Naval Research Logistics, 69(8), 1108–1123. https://doi.org/10.1002/nav.22074

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Kolassa, S. (2011) Combining exponential smoothing forecasts using Akaike weights. International Journal of Forecasting, 27, pp 238 - 251.
Svetunkov, I., Boylan, J.E., 2023b. Staying Positive: Challenges and Solutions in Using Pure Multiplicative ETS Models. IMA Journal of Management Mathematics. p. 403-425. doi:10.1093/imaman/dpad028

Taylor, J.W. and Bunn, D.W. (1999) A Quantile Regression Approach to Generating Prediction Intervals. Management Science, Vol 45, No 2, pp 225-237.
Lichtendahl Kenneth C., Jr., Grushka-Cockayne Yael, Winkler Robert L., (2013) Is It Better to Average Probabilities or Quantiles? Management Science 59(7):1594-1611. DOI: doi:10.1287/mnsc.1120.1667

Examples


y <- ts(rnorm(100,10,3), frequency=12)

adam(y, h=20, holdout=TRUE)
es(y, h=20, holdout=TRUE)
gum(y, h=20, holdout=TRUE)
auto.ces(y, h=20, holdout=TRUE)
auto.ssarima(y, h=20, holdout=TRUE)

Error measures for an estimated model

Description

Function produces error measures for the provided object and the holdout values of the response variable. Note that instead of parameters x, test, the function accepts the vector of values in holdout. Also, the parameters d and D are not supported - MASE is always calculated via division by first differences.

Usage

## S3 method for class 'smooth'
accuracy(object, holdout = NULL, ...)

## S3 method for class 'smooth.forecast'
accuracy(object, holdout = NULL, ...)

Arguments

object

The estimated model or a forecast from the estimated model generated via either predict() or forecast() functions.

holdout

The vector of values of the response variable in the holdout (test) set. If not provided, then the function will return the in-sample error measures. If the holdout=TRUE parameter was used in the estimation of a model, the holdout values will be extracted automatically.

...

Other variables passed to the forecast() function (e.g. newdata).

Details

The function is a wrapper for the measures function and is implemented for convenience.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

Examples


y <- rnorm(100, 100, 10)
ourModel <- adam(y, holdout=TRUE, h=10)
accuracy(ourModel)

ADAM is Augmented Dynamic Adaptive Model

Description

Function constructs an advanced Single Source of Error model, based on ETS taxonomy and ARIMA elements

Usage

adam(data, model = "ZXZ", lags = c(frequency(data)), orders = list(ar =
  c(0), i = c(0), ma = c(0), select = FALSE), constant = FALSE,
  formula = NULL, regressors = c("use", "select", "adapt"),
  occurrence = c("none", "auto", "fixed", "general", "odds-ratio",
  "inverse-odds-ratio", "direct"), distribution = c("default", "dnorm",
  "dlaplace", "ds", "dgnorm", "dlnorm", "dinvgauss", "dgamma"),
  loss = c("likelihood", "MSE", "MAE", "HAM", "LASSO", "RIDGE", "MSEh",
  "TMSE", "GTMSE", "MSCE", "GPL"), outliers = c("ignore", "use", "select"),
  level = 0.99, h = 0, holdout = FALSE, persistence = NULL,
  phi = NULL, initial = c("backcasting", "optimal", "two-stage",
  "complete"), arma = NULL, ic = c("AICc", "AIC", "BIC", "BICc"),
  bounds = c("usual", "admissible", "none"), silent = TRUE,
  ets = c("conventional", "adam"), ...)

## S3 method for class 'adam'
simulate(object, nsim = 1, seed = NULL,
  obs = nobs(object), ...)

auto.adam(data, model = "ZXZ", lags = c(frequency(data)),
  orders = list(ar = c(3, 3), i = c(2, 1), ma = c(3, 3), select = TRUE),
  formula = NULL, regressors = c("use", "select", "adapt"),
  occurrence = c("none", "auto", "fixed", "general", "odds-ratio",
  "inverse-odds-ratio", "direct"), distribution = c("dnorm", "dlaplace",
  "ds", "dgnorm", "dlnorm", "dinvgauss", "dgamma"), outliers = c("ignore",
  "use", "select"), level = 0.99, h = 0, holdout = FALSE,
  persistence = NULL, phi = NULL, initial = c("backcasting", "optimal",
  "two-stage", "complete"), arma = NULL, ic = c("AICc", "AIC", "BIC",
  "BICc"), bounds = c("usual", "admissible", "none"), silent = TRUE,
  parallel = FALSE, ets = c("conventional", "adam"), ...)

## S3 method for class 'adam'
sm(object, model = "YYY", lags = NULL, orders = list(ar =
  c(0), i = c(0), ma = c(0), select = FALSE), constant = FALSE,
  formula = NULL, regressors = c("use", "select", "adapt"), data = NULL,
  persistence = NULL, phi = NULL, initial = c("optimal", "backcasting"),
  arma = NULL, ic = c("AICc", "AIC", "BIC", "BICc"), bounds = c("usual",
  "admissible", "none"), silent = TRUE, ...)

Arguments

data

Vector, containing data needed to be forecasted. If a matrix (or data.frame / data.table) is provided, then the first column is used as a response variable, while the rest of the matrix is used as a set of explanatory variables. formula can be used in the latter case in order to define what relation to have.

model

The type of ETS model. The first letter stands for the type of the error term ("A" or "M"), the second (and sometimes the third as well) is for the trend ("N", "A", "Ad", "M" or "Md"), and the last one is for the type of seasonality ("N", "A" or "M"). In case of several lags, the seasonal components are assumed to be the same. The model is then printed out as ETS(M,Ad,M)[m1,m2,...], where m1, m2, ... are the lags specified by the lags parameter. There are several options for the model besides the conventional ones, which rely on information criteria:

model="ZZZ" means that the model will be selected based on the chosen information criteria type. The Branch and Bound is used in the process.
model="XXX" means that only additive components are tested, using Branch and Bound.
model="YYY" implies selecting between multiplicative components.
model="CCC" triggers the combination of forecasts of models using information criteria weights (Kolassa, 2011).
combinations between these four and the classical components are also accepted. For example, model="CAY" will combine models with additive trend and either none or multiplicative seasonality.
model="PPP" will produce the selection between pure additive and pure multiplicative models. "P" stands for "Pure". This cannot be mixed with other types of components.
model="FFF" will select between all the 30 types of models. "F" stands for "Full". This cannot be mixed with other types of components.
model="SSS" creates a pool of 19 standard sensible models, which have finite variance. This can be combined with "X" or "Y" to further restrict the pool. For example, model="SXS" corresponds to the default pool of 15 models in the ets() function from the forecast package.
The parameter model can also be a vector of names of models for a finer tuning (pool of models). For example, model=c("ANN","AAA") will estimate only two models and select the best of them.

Also, model can accept a previously estimated adam and use all its parameters.

Keep in mind that model selection with "Z" components uses Branch and Bound algorithm and may skip some models that could have slightly smaller information criteria. If you want to do a exhaustive search, you would need to list all the models to check as a vector.

The default value is set to "ZXZ", because the multiplicative trend is explosive and dangerous. It should be used only for each separate time series, not for the automated predictions for big datasets.

lags

Defines lags for the corresponding components. All components count, starting from level, so ETS(M,M,M) model for monthly data will have lags=c(1,1,12). However, the function will also accept lags=c(12), assuming that the lags 1 were dropped. In case of ARIMA, lags specify what should be the seasonal component lag. e.g. lags=c(1,12) will lead to the seasonal ARIMA with m=12. This can accept several lags, supporting multiple seasonal ETS and ARIMA models.

orders

The order of ARIMA to be included in the model. This should be passed either as a vector (in which case the non-seasonal ARIMA is assumed) or as a list of a type orders=list(ar=c(p,P),i=c(d,D),ma=c(q,Q)), in which case the lags variable is used in order to determine the seasonality m. See msarima for details. In addition, orders accepts one more parameter: orders=list(select=FALSE). If TRUE, then the function will select the most appropriate order using a mechanism similar to auto.msarima(), but implemented in auto.adam(). The values list(ar=...,i=...,ma=...) specify the maximum orders to check in this case.

constant

Logical, determining, whether the constant is needed in the model or not. This is mainly needed for ARIMA part of the model, but can be used for ETS as well. In case of pure regression, this is completely ignored (use formula instead).

formula

Formula to use in case of explanatory variables. If NULL, then all the variables are used as is. Can also include trend, which would add the global trend. Only needed if data is a matrix or if trend is provided.

regressors

The variable defines what to do with the provided explanatory variables: "use" means that all of the data should be used, while "select" means that a selection using ic should be done, "adapt" will trigger the mechanism of time varying parameters for the explanatory variables.

occurrence

The type of model used in probability estimation. Can be "none" - none, "fixed" - constant probability, "general" - the general Beta model with two parameters, "odds-ratio" - the Odds-ratio model with b=1 in Beta distribution, "inverse-odds-ratio" - the model with a=1 in Beta distribution, "direct" - the TSB-like (Teunter et al., 2011) probability update mechanism a+b=1, "auto" - the automatically selected type of occurrence model.

The type of model used in the occurrence is equal to the one provided in the model parameter.

Also, a model produced using oes or alm function can be used here.

distribution

what density function to assume for the error term. The full name of the distribution should be provided, starting with the letter "d" - "density". The names align with the names of distribution functions in R. For example, see dnorm. For detailed explanation of available distributions, see vignette in greybox package: vignette("greybox","alm").

loss

The type of Loss Function used in optimization. loss can be:

likelihood - the model is estimated via the maximisation of the likelihood of the function specified in distribution;
MSE (Mean Squared Error),
MAE (Mean Absolute Error),
HAM (Half Absolute Moment),
LASSO - use LASSO to shrink the parameters of the model;
RIDGE - use RIDGE to shrink the parameters of the model;
TMSE - Trace Mean Squared Error,
GTMSE - Geometric Trace Mean Squared Error,
MSEh - optimisation using only h-steps ahead error,
MSCE - Mean Squared Cumulative Error,
GPL - Generalised Predictive Likelihood.

In case of LASSO / RIDGE, the variables are not normalised prior to the estimation, but the parameters are divided by the mean values of explanatory variables.

Note that model selection and combination works properly only for the default loss="likelihood".

Furthermore, just for fun the absolute and half analogues of multistep estimators are available: MAEh, TMAE, GTMAE, MACE, HAMh, THAM, GTHAM, CHAM.

Last but not least, user can provide their own function here as well, making sure that it accepts parameters actual, fitted and B. Here is an example:

lossFunction <- function(actual, fitted, B) return(mean(abs(actual-fitted)))

loss=lossFunction

outliers

Defines what to do with outliers: "ignore", so just returning the model, "use" - detect outliers based on specified level and include dummies for them in the model, or detect and "select" those of them that reduce ic value.

level

What confidence level to use for detection of outliers. The default is 99%. The specific bounds of confidence interval depend on the distribution used in the model.

h

The forecast horizon. Mainly needed for the multistep loss functions.

holdout

Logical. If TRUE, then the holdout of the size h is taken from the data (can be used for the model testing purposes).

persistence

Persistence vector g, containing smoothing parameters. If NULL, then estimated. Can be also passed as a names list of the type: persistence=list(level=0.1, trend=0.05, seasonal=c(0.1,0.2), xreg=c(0.1,0.2)). Dropping some elements from the named list will make the function estimate them. e.g. if you don't specify seasonal in the persistence for the ETS(M,N,M) model, it will be estimated.

phi

Value of damping parameter. If NULL then it is estimated. Only applicable for damped-trend models.

initial

Can be either character or a list, or a vector of initial states. If it is character, then it can be "backcasting", meaning that the initials of dynamic part of the model are produced using backcasting procedure (advised for data with high frequency), or "optimal", meaning that all initial states are optimised, or "two-stage", meaning that optimisation is done after the backcasting, refining the states. In case of backcasting, the parameters of the explanatory variables are optimised. This is recommended for ETSX and ARIMAX models. Alternatively, you can set initial="complete" backcasting, which means that all states (including explanatory variables) are initialised via backcasting.

If a use provides a list of values, it is recommended to use the named one and to provide the initial components that are available. For example: initial=list(level=1000,trend=10,seasonal=list(c(1,2),c(1,2,3,4)), arima=1,xreg=100). If some of the components are needed by the model, but are not provided in the list, they will be estimated. If the vector is provided, then it is expected that the components will be provided in the same order as above, one after another without any gaps.

arma

Either the named list or a vector with AR / MA parameters ordered lag-wise. The number of elements should correspond to the specified orders e.g. orders=list(ar=c(1,1),ma=c(1,1)), lags=c(1,4), arma=list(ar=c(0.9,0.8),ma=c(-0.3,0.3))

ic

The information criterion to use in the model selection / combination procedure.

bounds

The type of bounds for the persistence to use in the model estimation. Can be either admissible - guaranteeing the stability of the model, usual - restricting the values with (0, 1) or none - no restrictions (potentially dangerous).

silent

Specifies, whether to provide the progress of the function or not. If TRUE, then the function will print what it does and how much it has already done.

ets

Parameter determining, which ETS formulation to use. If ets="conventional", the one from Hyndman et al. (2008) is used. In case of ets="adam", ADAM reformulation that updates multiplicative components differently is used. The latter is closer to applying ETS to log-transformed data when multiplicative components are used. This helps makig trend less explosive.

...

Other non-documented parameters. For example, FI=TRUE will make the function also produce Fisher Information matrix, which then can be used to calculated variances of smoothing parameters and initial states of the model. This is calculated based on the hessian of log-likelihood function and accepts stepSize parameter, determining how it is calculated. The default value is stepSize=.Machine$double.eps^(1/4). This is used in the vcov method. Number of iterations inside the backcasting loop to do is regulated with nIterations parameter. By default it is set to 2. Furthermore, starting values of parameters can be passed via B, while the upper and lower bounds should be passed in ub and lb respectively. In this case they will be used for optimisation. These values should have the length equal to the number of parameters to estimate in the following order:

All smoothing parameters (for the states and then for the explanatory variables);
Damping parameter (if needed);
ARMA parameters;
All the initial values (for the states and then for the explanatory variables).

You can also pass parameters to the optimiser in order to fine tune its work:

maxeval - maximum number of evaluations to carry out. The default is 40 per estimated parameter for ETS and / or ARIMA and at least 1000 if explanatory variables are introduced in the model (100 per parameter for explanatory variables, but not less than 1000);
maxtime - stop, when the optimisation time (in seconds) exceeds this;
xtol_rel - the relative precision of the optimiser (the default is 1E-6);
xtol_abs - the absolute precision of the optimiser (the default is 1E-8);
ftol_rel - the stopping criterion in case of the relative change in the loss function (the default is 1E-8);
ftol_abs - the stopping criterion in case of the absolute change in the loss function (the default is 0 - not used);
algorithm - the algorithm to use in optimisation (by default, "NLOPT_LN_NELDERMEAD" is used);
print_level - the level of output for the optimiser (0 by default). If equal to 41, then the detailed results of the optimisation are returned.

You can read more about these parameters by running the function nloptr.print.options. It is also possible to regulate what smoother to use to get initial seasonal indices from the msdecompose function via the smoother parameter. The default value is smoother="lowess". Finally, the parameter lambda for LASSO / RIDGE, alpha for the Asymmetric Laplace, shape for the Generalised Normal and nu for Student's distributions can be provided here as well.

object

The model previously estimated using adam() function.

nsim

Number of series to generate from the model.

seed

Random seed used in simulation of data.

obs

Number of observations to produce in the simulated data.

parallel

If TRUE, the estimation of ADAM models is done in parallel (used in auto.adam only). If the number is provided (e.g. parallel=41), then the specified number of cores is set up. WARNING! Packages foreach and either doMC (Linux and Mac only) or doParallel are needed in order to run the function in parallel.

Details

Function estimates ADAM in a form of the Single Source of Error state space model of the following type:

y_{t} = o_t (w(v_{t-l}) + h(x_t, a_{t-1}) + r(v_{t-l}) \epsilon_{t})

v_{t} = f(v_{t-l}, a_{t-1}) + g(v_{t-l}, a_{t-1}, x_{t}) \epsilon_{t}

Where o_{t} is the Bernoulli distributed random variable (in case of normal data it equals to 1 for all observations), v_{t} is the state vector and l is the vector of lags, x_t is the vector of exogenous variables. w(.) is the measurement function, r(.) is the error function, f(.) is the transition function, g(.) is the persistence function and a_t is the vector of parameters for exogenous variables. Finally, \epsilon_{t} is the error term.

The implemented model allows introducing several seasonal states and supports intermittent data via the occurrence variable.

The error term \epsilon_t can follow different distributions, which are regulated via the distribution parameter. This includes:

default - Normal distribution is used for the Additive error models, Gamma is used for the Multiplicative error models.
dnorm - Normal distribution,
dlaplace - Laplace distribution,
ds - S distribution,
dgnorm - Generalised Normal distribution,
dlnorm - Log-Normal distribution,
dgamma - Gamma distribution,
dinvgauss - Inverse Gaussian distribution,

For some more information about the model and its implementation, see the vignette: vignette("adam","smooth"). The more detailed explanation of ADAM is provided by Svetunkov (2021).

The function auto.adam() tries out models with the specified distributions and returns the one with the most suitable one based on selected information criterion.

sm.adam method estimates the scale model for the already estimated adam. In order for ADAM to take the SM model into account, the latter needs to be recorded in the former, amending the likelihood and the number of degrees of freedom. This can be done using implant method.

Value

Object of class "adam" is returned. It contains the list of the following values:

model - the name of the constructed model,
timeElapsed - the time elapsed for the estimation of the model,
data - the in-sample part of the data used for the training of the model. Includes the actual values in the first column,
holdout - the holdout part of the data, excluded for purposes of model evaluation,
fitted - the vector of fitted values,
residuals - the vector of residuals,
forecast - the point forecast for h steps ahead (by default NA is returned). NOTE that these do not always correspond to the conditional expectations for ETS models. See ADAM textbook, Section 6.4. for details (https://openforecast.org/adam/ETSTaxonomyMaths.html),
states - the matrix of states with observations in rows and states in columns,
persisten - the vector of smoothing parameters,
phi - the value of damping parameter,
transition - the transition matrix,
measurement - the measurement matrix with observations in rows and state elements in columns,
initial - the named list of initial values, including level, trend, seasonal, ARIMA and xreg components,
initialEstimated - the named vector, defining which of the initials were estimated in the model,
initialType - the type of initialisation used (backcasting/optimal/two-stage/complete/provided),
orders - the orders of ARIMA used in the estimation,
constant - the value of the constant (if it was included),
arma - the list of AR / MA parameters used in the model,
nParam - the matrix of the estimated / provided parameters,
occurrence - the oes model used for the occurrence part of the model,
formula - the formula used for the explanatory variables expansion,
loss - the type of loss function used in the estimation,
lossValue - the value of that loss function,
logLik - the value of the log-likelihood,
distribution - the distribution function used in the calculation of the likelihood,
scale - the value of the scale parameter,
lambda - the value of the parameter used in LASSO / dalaplace / dt,
B - the vector of all estimated parameters,
lags - the vector of lags used in the model construction,
lagsAll - the vector of the internal lags used in the model,
profile - the matrix with the profile used in the construction of the model,
profileInitial - the matrix with the initial profile (for the before the sample values),
call - the call used in the evaluation,
bounds - the type of bounds used in the process,
res - result of the model estimation, the output of the nloptr() function, explaining how optimisation went,
other - the list with other parameters, such as shape for distributions or ARIMA polynomials.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Svetunkov, I. (2023). Forecasting and Analytics with the Augmented Dynamic Adaptive Model (ADAM) (1st ed.). Chapman and Hall/CRC. doi:10.1201/9781003452652, online version: https://openforecast.org/adam/.

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Svetunkov, I., Boylan, J.E., 2023a. iETS: State Space Model for Intermittent Demand Forecastings. International Journal of Production Economics. 109013. doi:10.1016/j.ijpe.2023.109013
Teunter R., Syntetos A., Babai Z. (2011). Intermittent demand: Linking forecasting to inventory obsolescence. European Journal of Operational Research, 214, 606-615.
Croston, J. (1972) Forecasting and stock control for intermittent demands. Operational Research Quarterly, 23(3), 289-303.

Kolassa, S. (2011) Combining exponential smoothing forecasts using Akaike weights. International Journal of Forecasting, 27, pp 238 - 251.
Svetunkov, I., Boylan, J.E., 2023b. Staying Positive: Challenges and Solutions in Using Pure Multiplicative ETS Models. IMA Journal of Management Mathematics. p. 403-425. doi:10.1093/imaman/dpad028

Examples


### The main examples are provided in the adam vignette, check it out via:
## Not run: vignette("adam","smooth")

# Model selection using a specified pool of models
ourModel <- adam(rnorm(100,100,10), model=c("ANN","ANA","AAA"), lags=c(5,10))
adamSummary <- summary(ourModel)
xtable(adamSummary)

forecast(ourModel)
par(mfcol=c(3,4))
plot(ourModel, c(1:11))

# Model combination using a specified pool
ourModel <- adam(rnorm(100,100,10), model=c("ANN","AAN","MNN","CCC"),
                          lags=c(5,10))

# ADAM ARIMA
ourModel <- adam(rnorm(100,100,10), model="NNN",
                          lags=c(1,4), orders=list(ar=c(1,0),i=c(1,0),ma=c(1,1)))

# Fit ADAM to the data
ourModel <- adam(rnorm(100,100,10), model="AAdN")
# Simulate the data
x <- simulate(ourModel)
# Automatic selection of appropriate distribution and orders of ADAM ETS+ARIMA
ourModel <- auto.adam(rnorm(100,100,10), model="ZZN", lags=c(1,4),
                      orders=list(ar=c(2,2),ma=c(2,2),select=TRUE))

Automatic Occurrence Model Selection

Description

Fits om for each supplied occurrence type and returns the model with the lowest information criterion.

Usage

auto.om(data, model = "ZXZ", lags = c(frequency(data)), orders = list(ar
  = c(3, 3), i = c(2, 1), ma = c(3, 3), select = TRUE), formula = NULL,
  regressors = c("use", "select", "adapt"), occurrence = c("fixed",
  "odds-ratio", "inverse-odds-ratio", "direct", "general"), h = 0,
  holdout = FALSE, persistence = NULL, phi = NULL,
  initial = c("backcasting", "optimal", "two-stage", "complete"),
  arma = NULL, ic = c("AICc", "AIC", "BIC", "BICc"), bounds = c("usual",
  "admissible", "none"), silent = TRUE, parallel = FALSE,
  ets = c("conventional", "adam"), ...)

Arguments

data

Numeric vector, time series, or data frame. Non-binary input is automatically binarised: any non-zero value becomes 1.

model

Three-letter ETS specification (wildcards "Z"/"X"/"Y" supported).

lags

Vector of seasonal lags. Defaults to frequency(data).

orders

ARIMA orders: list(ar, i, ma, select=TRUE/FALSE). When select=TRUE, ARIMA orders are selected automatically for each occurrence type.

formula

Optional formula for external regressors.

regressors

How to handle regressors: "use", "select", or "adapt".

occurrence

Character vector of occurrence link types to try: "fixed", "odds-ratio", "inverse-odds-ratio", "direct".

h

Forecast horizon.

holdout

If TRUE, a holdout sample of size h is withheld.

persistence

Optional persistence (smoothing) parameter vector.

phi

Optional damping parameter.

initial

Initialisation method: "backcasting", "optimal", "two-stage", or "complete".

arma

Optional fixed ARMA parameters.

ic

Information criterion used for selection.

bounds

Parameter bounds type.

silent

If TRUE, suppresses progress messages and plots.

parallel

If TRUE (or a core count), fit occurrence types in parallel. Requires foreach and doMC / doParallel.

ets

ETS flavour passed to om: "conventional" or "adam".

...

Additional arguments forwarded to om.

Value

The best om object (lowest IC), with $call set to the auto.om() call and $timeElapsed recording wall time.

Examples

set.seed(42)
y <- rbinom(120, 1, 0.6)
m <- auto.om(y, occurrence = c("fixed", "odds-ratio"))

Complex Exponential Smoothing

Description

Function estimates CES in state space form with information potential equal to errors and returns several variables.

Usage

ces(y, seasonality = c("none", "simple", "partial", "full"),
  lags = c(frequency(y)), initial = c("backcasting", "optimal",
  "two-stage", "complete"), a = NULL, b = NULL, loss = c("likelihood",
  "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE", "MSCE", "GPL"), h = 0,
  holdout = FALSE, bounds = c("admissible", "none"), silent = TRUE,
  model = NULL, xreg = NULL, regressors = c("use", "select", "adapt"),
  initialX = NULL, ...)

auto.ces(y, seasonality = c("none", "simple", "partial", "full"),
  lags = c(frequency(y)), initial = c("backcasting", "optimal",
  "two-stage", "complete"), ic = c("AICc", "AIC", "BIC", "BICc"),
  loss = c("likelihood", "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE",
  "MSCE", "GPL"), h = 0, holdout = FALSE, bounds = c("admissible",
  "none"), silent = TRUE, xreg = NULL, regressors = c("use", "select",
  "adapt"), ...)

Arguments

y

Vector or ts object, containing data needed to be forecasted.

seasonality

The type of seasonality used in CES. Can be: none - No seasonality; simple - Simple seasonality, using lagged CES (based on t-m observation, where m is the seasonality lag); partial - Partial seasonality with the real seasonal component (equivalent to additive seasonality); full - Full seasonality with complex seasonal component (can do both multiplicative and additive seasonality, depending on the data). First letter can be used instead of full words.

In case of the auto.ces() function, this parameter defines which models to try.

lags

Vector of lags to use in the model. Allows defining multiple seasonal models.

initial

Can be either character or a list, or a vector of initial states. If it is character, then it can be "backcasting", meaning that the initials of dynamic part of the model are produced using backcasting procedure (advised for data with high frequency), or "optimal", meaning that all initial states are optimised, or "two-stage", meaning that optimisation is done after the backcasting, refining the states. In case of backcasting, the parameters of the explanatory variables are optimised. Alternatively, you can set initial="complete" backcasting, which means that all states (including explanatory variables) are initialised via backcasting.

a

First complex smoothing parameter. Should be a complex number.

NOTE! CES is very sensitive to a and b values so it is advised either to leave them alone, or to use values from previously estimated model.

b

Second complex smoothing parameter. Can be real if seasonality="partial". In case of seasonality="full" must be complex number.

loss

The type of Loss Function used in optimization. loss can be: likelihood (assuming Normal distribution of error term), MSE (Mean Squared Error), MAE (Mean Absolute Error), HAM (Half Absolute Moment), TMSE - Trace Mean Squared Error, GTMSE - Geometric Trace Mean Squared Error, MSEh - optimisation using only h-steps ahead error, MSCE - Mean Squared Cumulative Error. If loss!="MSE", then likelihood and model selection is done based on equivalent MSE. Model selection in this cases becomes not optimal.

There are also available analytical approximations for multistep functions: aMSEh, aTMSE and aGTMSE. These can be useful in cases of small samples.

Finally, just for fun the absolute and half analogues of multistep estimators are available: MAEh, TMAE, GTMAE, MACE, TMAE, HAMh, THAM, GTHAM, CHAM.

h

Length of forecasting horizon.

holdout

If TRUE, holdout sample of size h is taken from the end of the data.

bounds

The type of bounds for the persistence to use in the model estimation. Can be either admissible - guaranteeing the stability of the model, or none - no restrictions (potentially dangerous).

silent

accepts TRUE and FALSE. If FALSE, the function will print its progress and produce a plot at the end.

model

A previously estimated GUM model, if provided, the function will not estimate anything and will use all its parameters.

xreg

The vector (either numeric or time series) or the matrix (or data.frame) of exogenous variables that should be included in the model. If matrix included than columns should contain variables and rows - observations. Note that xreg should have number of observations equal either to in-sample or to the whole series. If the number of observations in xreg is equal to in-sample, then values for the holdout sample are produced using es function.

regressors

The variable defines what to do with the provided xreg: "use" means that all of the data should be used, while "select" means that a selection using ic should be done.

initialX

The vector of initial parameters for exogenous variables. Ignored if xreg is NULL.

...

Other non-documented parameters. See adam for details. However, there are several unique parameters passed to the optimiser in comparison with adam: 1. algorithm0, which defines what algorithm to use in nloptr for the initial optimisation. By default, this is "NLOPT_LN_BOBYQA". 2. algorithm determines the second optimiser. By default this is "NLOPT_LN_NELDERMEAD". 3. maxeval0 and maxeval, that determine the number of iterations for the two optimisers. By default, maxeval0=maxeval=40*k, where k is the number of estimated parameters. 4. xtol_rel0 and xtol_rel, which are 1e-8 and 1e-6 respectively. There are also ftol_rel0, ftol_rel, ftol_abs0 and ftol_abs, which by default are set to values explained in the nloptr.print.options() function.

ic

The information criterion to use in the model selection.

Details

The function estimates Complex Exponential Smoothing in the state space form described in Svetunkov et al. (2022) with the information potential equal to the approximation error.

The auto.ces() function implements the automatic seasonal component selection based on information criteria.

ces() uses two optimisers to get good estimates of parameters. By default these are BOBYQA and then Nelder-Mead. This can be regulated via ... - see details below.

For some more information about the model and its implementation, see the vignette: vignette("ces","smooth")

Value

Object of class "adam" is returned with similar elements to the adam function.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Svetunkov, I. (2023). Forecasting and Analytics with the Augmented Dynamic Adaptive Model (ADAM) (1st ed.). Chapman and Hall/CRC. doi:10.1201/9781003452652, online version: https://openforecast.org/adam/.

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Svetunkov, I., Kourentzes, N., & Ord, J. K. (2022). Complex exponential smoothing. Naval Research Logistics, 69(8), 1108–1123. https://doi.org/10.1002/nav.22074

Examples

y <- rnorm(100,10,3)
ces(y, h=20, holdout=FALSE)

y <- 500 - c(1:100)*0.5 + rnorm(100,10,3)
ces(y, h=20, holdout=TRUE)

ces(BJsales, h=8, holdout=TRUE)

ces(AirPassengers, h=18, holdout=TRUE, seasonality="s")
ces(AirPassengers, h=18, holdout=TRUE, seasonality="p")
ces(AirPassengers, h=18, holdout=TRUE, seasonality="f")

y <- ts(rnorm(100,10,3),frequency=12)
# CES with and without holdout
auto.ces(y,h=20,holdout=TRUE)
auto.ces(y,h=20,holdout=FALSE)


# Selection between "none" and "full" seasonalities
auto.ces(AirPassengers, h=12, holdout=TRUE,
                   seasonality=c("n","f"), ic="AIC")

ourModel <- auto.ces(AirPassengers)

summary(ourModel)
forecast(ourModel, h=12)

Centered Moving Average

Description

Function constructs centered moving average based on state space SMA

Usage

cma(y, order = NULL, silent = TRUE, ...)

Arguments

y

Vector or ts object, containing data needed to be smoothed.

order

Order of centered moving average. If NULL, then the function will try to select order of SMA based on information criteria. See sma for details.

silent

If TRUE, then plot is not produced. Otherwise, there is a plot...

...

Nothing. Needed only for the transition to the new name of variables.

Details

If the order is odd, then the function constructs SMA(order) and shifts it back in time. Otherwise an AR(order+1) model is constructed with the preset parameters:

phi_i = {0.5,1,1,...,0.5} / order

This then corresponds to the centered MA with 0.5 weight for the first observation and 0.5 weight for an additional one. e.g. if this is monthly data and we use order=12, then half of the first January and half of the new one is taken.

This is not a forecasting tool. This is supposed to smooth the time series in order to find trend. So don't expect any forecasts from this function!

Value

Object of class "smooth" is returned. It contains the list of the following values:

model - the name of the estimated model.
timeElapsed - time elapsed for the construction of the model.
order - order of the moving average.
nParam - table with the number of estimated / provided parameters. If a previous model was reused, then its initials are reused and the number of provided parameters will take this into account.
fitted - the fitted values, shifted in time.
forecast - NAs, because this function does not produce forecasts.
residuals - the residuals of the SMA / AR model.
s2 - variance of the residuals (taking degrees of freedom into account) of the SMA / AR model.
y - the original data.
ICs - values of information criteria from the respective SMA or AR model. Includes AIC, AICc, BIC and BICc.
logLik - log-likelihood of the SMA / AR model.
lossValue - Cost function value (for the SMA / AR model).
loss - Type of loss function used in the estimation.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Examples


# CMA of specific order
ourModel <- cma(rnorm(118,100,3),order=12)

# CMA of arbitrary order
ourModel <- cma(rnorm(118,100,3))

summary(ourModel)

Exponential Smoothing in SSOE state space model

Description

Function constructs ETS model and returns forecast, fitted values, errors and matrix of states. It is a wrapper of adam function.

Usage

es(y, model = "ZXZ", lags = c(frequency(y)), persistence = NULL,
  phi = NULL, initial = c("backcasting", "optimal", "two-stage",
  "complete"), initialSeason = NULL, ic = c("AICc", "AIC", "BIC", "BICc"),
  loss = c("likelihood", "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE",
  "MSCE", "GPL"), h = 0, holdout = FALSE, bounds = c("usual",
  "admissible", "none"), silent = TRUE, xreg = NULL,
  regressors = c("use", "select"), initialX = NULL, ...)

Arguments

y

Vector or ts object, containing data needed to be forecasted.

model

The type of ETS model. The first letter stands for the type of the error term ("A" or "M"), the second (and sometimes the third as well) is for the trend ("N", "A", "Ad", "M" or "Md"), and the last one is for the type of seasonality ("N", "A" or "M"). So, the function accepts words with 3 or 4 characters: ANN, AAN, AAdN, AAA, AAdA, MAdM etc. ZZZ means that the model will be selected based on the chosen information criteria type. Models pool can be restricted with additive only components. This is done via model="XXX". For example, making selection between models with none / additive / damped additive trend component only (i.e. excluding multiplicative trend) can be done with model="ZXZ". Furthermore, selection between multiplicative models (excluding additive components) is regulated using model="YYY". This can be useful for positive data with low values (for example, slow moving products). Finally, if model="CCC", then all the models are estimated and combination of their forecasts using AIC weights is produced (Kolassa, 2011). This can also be regulated. For example, model="CCN" will combine forecasts of all non-seasonal models and model="CXY" will combine forecasts of all the models with non-multiplicative trend and non-additive seasonality with either additive or multiplicative error. Not sure why anyone would need this thing, but it is available.

The parameter model can also be a vector of names of models for a finer tuning (pool of models). For example, model=c("ANN","AAA") will estimate only two models and select the best of them.

Also model can accept a previously estimated ES or ETS (from forecast package) model and use all its parameters.

Keep in mind that model selection with "Z" components uses Branch and Bound algorithm and may skip some models that could have slightly smaller information criteria.

lags

persistence

Persistence vector g, containing smoothing parameters. If NULL, then estimated.

phi

Value of damping parameter. If NULL then it is estimated.

initial

Can be either character or a list, or a vector of initial states. If it is character, then it can be "backcasting", meaning that the initials of dynamic part of the model are produced using backcasting procedure (advised for data with high frequency), or "optimal", meaning that all initial states are optimised, or "two-stage", meaning that optimisation is done after the backcasting, refining the states. In case of backcasting, the parameters of the explanatory variables are optimised. Alternatively, you can set initial="complete" backcasting, which means that all states (including explanatory variables) are initialised via backcasting.

initialSeason

Vector of initial values for seasonal components. If NULL, they are estimated during optimisation.

ic

The information criterion used in the model selection procedure.

loss

There are also available analytical approximations for multistep functions: aMSEh, aTMSE and aGTMSE. These can be useful in cases of small samples.

Finally, just for fun the absolute and half analogues of multistep estimators are available: MAEh, TMAE, GTMAE, MACE, TMAE, HAMh, THAM, GTHAM, CHAM.

h

Length of forecasting horizon.

holdout

If TRUE, holdout sample of size h is taken from the end of the data.

bounds

What type of bounds to use in the model estimation. The first letter can be used instead of the whole word. "usual" implies restrictions on the smoothing parameter, guaranteeing that the exponential smoothing behaves as an averaging model. "admissible" guarantee that the model is stable.

silent

accepts TRUE and FALSE. If FALSE, the function will print its progress and produce a plot at the end.

xreg

regressors

The variable defines what to do with the provided xreg: "use" means that all of the data should be used, while "select" means that a selection using ic should be done.

initialX

The vector of initial parameters for exogenous variables. Ignored if xreg is NULL.

...

Other non-documented parameters. For example FI=TRUE will make the function also produce Fisher Information matrix, which then can be used to calculated variances of smoothing parameters and initial states of the model. Parameters B, lb and ub can be passed via ellipsis as well. In this case they will be used for optimisation. B sets the initial values before the optimisation, lb and ub define lower and upper bounds for the search inside of the specified bounds. These values should have length equal to the number of parameters to estimate. You can also pass two parameters to the optimiser: 1. maxeval - maximum number of evaluations to carry on; 2. xtol_rel - the precision of the optimiser. The default values used in es() are maxeval=500 and xtol_rel=1e-8. You can read more about these parameters in the documentation of nloptr function.

Details

Function estimates ETS in a form of the Single Source of Error state space model of the following type:

y_{t} = o_t (w(v_{t-l}) + h(x_t, a_{t-1}) + r(v_{t-l}) \epsilon_{t})

v_{t} = f(v_{t-l}) + g(v_{t-l}) \epsilon_{t}

a_{t} = F_{X} a_{t-1} + g_{X} \epsilon_{t} / x_{t}

Where o_{t} is the Bernoulli distributed random variable (in case of normal data it equals to 1 for all observations), v_{t} is the state vector and l is the vector of lags, x_t is the vector of exogenous variables. w(.) is the measurement function, r(.) is the error function, f(.) is the transition function, g(.) is the persistence function and h(.) is the explanatory variables function. a_t is the vector of parameters for exogenous variables, F_{X} is the transitionX matrix and g_{X} is the persistenceX matrix. Finally, \epsilon_{t} is the error term.

For the details see Svetunkov (2023) or Hyndman et al.(2008).

For some more information about the model and its implementation, see the vignette: vignette("es","smooth").

Also, there are posts about the functions of the package smooth on the website of Ivan Svetunkov: https://openforecast.org/category/r-en/smooth/ - they explain the underlying models and how to use the functions.

Value

Object of class "adam" is returned. It contains the list of the following values for classical ETS models:

model - type of constructed model.
formula - mathematical formula, describing interactions between components of es() and exogenous variables.
timeElapsed - time elapsed for the construction of the model.
states - matrix of the components of ETS.
persistence - persistence vector. This is the place, where smoothing parameters live.
phi - value of damping parameter.
transition - transition matrix of the model.
measurement - measurement vector of the model.
initialType - type of the initial values used.
initial - initial values of the state vector (non-seasonal).
initialSeason - initial values of the seasonal part of state vector.
nParam - table with the number of estimated / provided parameters. If a previous model was reused, then its initials are reused and the number of provided parameters will take this into account.
fitted - fitted values of ETS. In case of the intermittent model, the fitted are multiplied by the probability of occurrence.
forecast - the point forecast for h steps ahead (by default NA is returned). NOTE that these do not always correspond to the conditional expectations. See ADAM textbook, Section 4.4. for details (https://openforecast.org/adam/ETSTaxonomyMaths.html),
lower - lower bound of prediction interval. When interval="none" then NA is returned.
upper - higher bound of prediction interval. When interval="none" then NA is returned.
residuals - residuals of the estimated model.
errors - trace forecast in-sample errors, returned as a matrix. Only returned when the multistep losses are used and semiparametric interval is needed.
s2 - variance of the residuals (taking degrees of freedom into account). This is an unbiased estimate of variance.
interval - type of interval asked by user.
level - confidence level for interval.
cumulative - whether the produced forecast was cumulative or not.
y - original data.
holdout - holdout part of the original data.
xreg - provided vector or matrix of exogenous variables. If regressors="s", then this value will contain only selected exogenous variables.
initialX - initial values for parameters of exogenous variables.
ICs - values of information criteria of the model. Includes AIC, AICc, BIC and BICc.
logLik - concentrated log-likelihood of the function.
lossValue - loss function value.
loss - type of loss function used in the estimation.
FI - Fisher Information. Equal to NULL if FI=FALSE or when FI is not provided at all.
accuracy - vector of accuracy measures for the holdout sample. In case of non-intermittent data includes: MPE, MAPE, SMAPE, MASE, sMAE, RelMAE, sMSE and Bias coefficient (based on complex numbers). In case of intermittent data the set of errors will be: sMSE, sPIS, sCE (scaled cumulative error) and Bias coefficient. This is available only when holdout=TRUE.
B - the vector of all the estimated parameters.

If combination of forecasts is produced (using model="CCC"), then a shorter list of values is returned:

model,
timeElapsed,
initialType,
fitted,
forecast,
lower,
upper,
residuals,
s2 - variance of additive error of combined one-step-ahead forecasts,
interval,
level,
cumulative,
y,
holdout,
ICs - combined ic,
ICw - ic weights used in the combination,
loss,
xreg,
accuracy.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Kolassa, S. (2011) Combining exponential smoothing forecasts using Akaike weights. International Journal of Forecasting, 27, pp 238 - 251.
Svetunkov, I., Boylan, J.E., 2023b. Staying Positive: Challenges and Solutions in Using Pure Multiplicative ETS Models. IMA Journal of Management Mathematics. p. 403-425. doi:10.1093/imaman/dpad028

Svetunkov, I. (2023). Forecasting and Analytics with the Augmented Dynamic Adaptive Model (ADAM) (1st ed.). Chapman and Hall/CRC. doi:10.1201/9781003452652, online version: https://openforecast.org/adam/.

Examples


# See how holdout and trace parameters influence the forecast
es(BJsales,model="AAdN",h=8,holdout=FALSE,loss="MSE")
es(AirPassengers,model="MAM",h=18,holdout=TRUE,loss="TMSE")

# Model selection example
es(BJsales,model="ZZN",ic="AIC",h=8,holdout=FALSE,bounds="a")

# Model selection. Compare AICc of these two models:
es(AirPassengers,"ZZZ",h=10,holdout=TRUE)
es(AirPassengers,"MAdM",h=10,holdout=TRUE)

# Model selection, excluding multiplicative trend
es(AirPassengers,model="ZXZ",h=8,holdout=TRUE)

# Combination example
es(BJsales,model="CCN",h=8,holdout=TRUE)

# Model selection using a specified pool of models
ourModel <- es(AirPassengers,model=c("ANN","AAM","AMdA"),h=18)

# Produce forecast and prediction interval
forecast(ourModel, h=18, interval="parametric")

# Semiparametric interval example
forecast(ourModel, h=18, interval="semiparametric")

# This will be the same model as in previous line but estimated on new portion of data
es(BJsales,model=ourModel,h=18,holdout=FALSE)

Forecasting time series using smooth functions

Description

Function produces conditional expectation (point forecasts) and prediction intervals for the estimated model.

Usage

## S3 method for class 'adam'
forecast(object, h = 10, newdata = NULL,
  occurrence = NULL, interval = c("none", "prediction", "confidence",
  "simulated", "approximate", "semiparametric", "nonparametric", "empirical",
  "complete"), level = 0.95, side = c("both", "upper", "lower"),
  cumulative = FALSE, nsim = NULL, scenarios = FALSE, ...)

## S3 method for class 'msdecompose'
forecast(object, h = 10, interval = c("parametric",
  "semiparametric", "nonparametric", "none"), level = 0.95, model = NULL,
  ...)

## S3 method for class 'om'
forecast(object, h = 10, ...)

## S3 method for class 'omg'
forecast(object, h = 10, ...)

Arguments

object

Time series model for which forecasts are required.

h

Forecast horizon.

newdata

The new data needed in order to produce forecasts.

occurrence

The vector containing the future occurrence variable (values in [0,1]), if it is known.

interval

What type of mechanism to use for interval construction. the recommended option is interval="prediction", which will use analytical solutions for pure additive models and simulations for the others. interval="simulated" is the slowest method, but is robust to the type of model. interval="approximate" (aka interval="parametric") uses analytical formulae for conditional h-steps ahead variance, but is approximate for the non-additive error models. interval="semiparametric" relies on the multiple steps ahead forecast error (extracted via rmultistep method) and on the assumed distribution of the error term. interval="nonparametric" uses Taylor & Bunn (1999) approach with quantile regressions. interval="empirical" constructs intervals based on empirical quantiles of multistep forecast errors. interval="complete" will call for reforecast() function and produce interval based on the uncertainty around the parameters of the model. Finally, interval="confidence" tries to generate the confidence intervals for the point forecast based on the reforecast method.

level

Confidence level. Defines width of prediction interval.

side

Defines, whether to provide "both" sides of prediction interval or only "upper", or "lower".

cumulative

If TRUE, then the cumulative forecast and prediction interval are produced instead of the normal ones. This is useful for inventory control systems.

nsim

Number of iterations to do in cases of interval="simulated", interval="prediction" (for mixed and multiplicative model), interval="confidence" and interval="complete". The default value for the prediction / simulated interval is 1000. In case of confidence or complete intervals, this is set to 100.

scenarios

Binary, defining whether to return scenarios produced via simulations or not. Only works if interval="simulated". If TRUE the object will contain scenarios variable.

...

Other arguments passed to reforecast(), or predict() functions.

model

The type of ETS model to fit on the decomposed trend. Only applicable to "msdecompose" class. This is then returned in parameter "esmodel". If NULL, then it will be selected automatically based on the type of the used decomposition (either among pure additive or among pure multiplicative ETS models).

Details

By default the function will generate conditional expectations from the estimated model and will also produce a variety of prediction intervals based on user preferences.

Value

Returns object of class "smooth.forecast", which contains:

model - the estimated model (ES / CES / GUM / SSARIMA).
method - the name of the estimated model (ES / CES / GUM / SSARIMA).
forecast aka mean - point forecasts of the model (conditional mean).
lower - lower bound of prediction interval.
upper - upper bound of prediction interval.
level - confidence level.
interval - binary variable (whether interval were produced or not).
scenarios - in case of forecast.adam() and interval="simulated" returns matrix with scenarios (future paths) that were used in simulations.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Taylor, J.W. and Bunn, D.W. (1999) A Quantile Regression Approach to Generating Prediction Intervals. Management Science, Vol 45, No 2, pp 225-237.
Lichtendahl Kenneth C., Jr., Grushka-Cockayne Yael, Winkler Robert L., (2013) Is It Better to Average Probabilities or Quantiles? Management Science 59(7):1594-1611. DOI: doi:10.1287/mnsc.1120.1667

Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag.

Examples


ourModel <- es(rnorm(100,0,1), h=10)
forecast(ourModel, h=10, interval="parametric")

Generalised Univariate Model

Description

Function constructs Generalised Univariate Model, estimating matrices F, w, vector g and initial parameters.

Usage

gum(y, orders = c(1, 1), lags = c(1, frequency(y)), type = c("additive",
  "multiplicative"), initial = c("backcasting", "optimal", "two-stage",
  "complete"), persistence = NULL, transition = NULL,
  measurement = rep(1, sum(orders)), loss = c("likelihood", "MSE", "MAE",
  "HAM", "MSEh", "TMSE", "GTMSE", "MSCE", "GPL"), h = 0, holdout = FALSE,
  bounds = c("usual", "admissible", "none"), silent = TRUE, model = NULL,
  xreg = NULL, regressors = c("use", "select", "adapt", "integrate"),
  initialX = NULL, ...)

auto.gum(y, orders = 3, lags = frequency(y), type = c("additive",
  "multiplicative", "select"), initial = c("backcasting", "optimal",
  "two-stage", "complete"), ic = c("AICc", "AIC", "BIC", "BICc"),
  loss = c("likelihood", "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE",
  "MSCE", "GPL"), h = 0, holdout = FALSE, bounds = c("usual",
  "admissible", "none"), silent = TRUE, xreg = NULL,
  regressors = c("use", "select", "adapt", "integrate"), ...)

Arguments

y

Vector or ts object, containing data needed to be forecasted.

orders

Order of the model. Specified as vector of number of states with different lags. For example, orders=c(1,1) means that there are two states: one of the first lag type, the second of the second type. In case of auto.gum(), this parameters is the value of the max order to check.

lags

Defines lags for the corresponding orders. If, for example, orders=c(1,1) and lags are defined as lags=c(1,12), then the model will have two states: the first will have lag 1 and the second will have lag 12. The length of lags must correspond to the length of orders. In case of the auto.gum(), the value of the maximum lag to check. This should usually be a maximum frequency of the data.

type

Type of model. Can either be "additive" or "multiplicative". The latter means that the GUM is fitted on log-transformed data. In case of auto.gum(), can also be "select", implying automatic selection of the type.

initial

Can be either character or a list, or a vector of initial states. If it is character, then it can be "backcasting", meaning that the initials of dynamic part of the model are produced using backcasting procedure (advised for data with high frequency), or "optimal", meaning that all initial states are optimised, or "two-stage", meaning that optimisation is done after the backcasting, refining the states. In case of backcasting, the parameters of the explanatory variables are optimised. Alternatively, you can set initial="complete" backcasting, which means that all states (including explanatory variables) are initialised via backcasting.

persistence

Persistence vector g, containing smoothing parameters. If NULL, then estimated.

transition

Transition matrix F. Can be provided as a vector. Matrix will be formed using the default matrix(transition,nc,nc), where nc is the number of components in the state vector. If NULL, then estimated.

measurement

Measurement vector w. If NULL, then estimated.

loss

There are also available analytical approximations for multistep functions: aMSEh, aTMSE and aGTMSE. These can be useful in cases of small samples.

Finally, just for fun the absolute and half analogues of multistep estimators are available: MAEh, TMAE, GTMAE, MACE, TMAE, HAMh, THAM, GTHAM, CHAM.

h

Length of forecasting horizon.

holdout

If TRUE, holdout sample of size h is taken from the end of the data.

bounds

The type of bounds for the parameters to use in the model estimation. Can be either admissible - guaranteeing the stability of the model, "usual" restrict all the parameters with the (0, 1) region, or none - no restrictions (potentially dangerous).

silent

accepts TRUE and FALSE. If FALSE, the function will print its progress and produce a plot at the end.

model

A previously estimated GUM model, if provided, the function will not estimate anything and will use all its parameters.

xreg

regressors

The variable defines what to do with the provided xreg: "use" means that all of the data should be used, while "select" means that a selection using ic should be done.

initialX

The vector of initial parameters for exogenous variables. Ignored if xreg is NULL.

...

ic

The information criterion used in the model selection procedure.

Details

The function estimates the Single Source of Error state space model of the following type:

y_{t} = w_t' v_{t-l} + \epsilon_{t}

v_{t} = F v_{t-l} + g_{t} \epsilon_{t}

where v_{t} is the state vector (defined using orders) and l is the vector of lags, w_t is the measurement vector (which includes fixed elements and explanatory variables), F is the transition matrix, g_t is the persistence vector (includes explanatory variables as well if provided), finally, \epsilon_{t} is the error term.

For some more information about the model and its implementation, see the vignette: vignette("gum","smooth")

Value

Object of class "adam" is returned with similar elements to the adam function.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Examples

gum(BJsales, h=8, holdout=TRUE)

ourModel <- gum(rnorm(118,100,3), orders=c(2,1), lags=c(1,4), h=18, holdout=TRUE)

# Redo previous model on a new data and produce prediction interval
gum(rnorm(118,100,3), model=ourModel, h=18)

# Produce something crazy with optimal initials (not recommended)
gum(rnorm(118,100,3), orders=c(1,1,1), lags=c(1,3,5), h=18, holdout=TRUE, initial="o")

# Simpler model estimated using trace forecast error loss function and its analytical analogue
gum(rnorm(118,100,3), orders=c(1), lags=c(1), h=18, holdout=TRUE, bounds="n", loss="TMSE")


x <- rnorm(50,100,3)

# The best GUM model for the data
ourModel <- auto.gum(x, orders=2, lags=4, h=18, holdout=TRUE)

summary(ourModel)

Smooth classes checkers

Description

Functions to check if an object is of the specified class

Usage

is.smooth(x)

is.smoothC(x)

is.msarima(x)

is.om(x)

is.omg(x)

is.smooth.sim(x)

is.smooth.forecast(x)

is.adam(x)

is.adam.sim(x)

is.msdecompose(x)

is.msdecompose.forecast(x)

Arguments

x

The object to check.

Details

The list of functions includes:

is.smooth() tests if the object was produced by a smooth function (e.g. es / ces / ssarima / gum / sma / msarima);
is.msarima() tests if the object was produced by the msarima function;
is.smoothC() tests if the object was produced by a combination function (currently applies only to smoothCombine);
is.smooth.sim() tests if the object was produced by simulate functions (e.g. sim.es / sim.ces / sim.ssarima / sim.sma / sim.gum);
is.smooth.forecast() checks if the forecast was produced from a smooth function using forecast() function.

The list of functions includes:

is.adam() tests if the object was produced by a adam function
is.adam.sim() tests if the object was produced by sim.adam() function (not implemented yet);

Value

TRUE if this is the specified class and FALSE otherwise.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

Examples


ourModel <- msarima(rnorm(100,100,10))

is.smooth(ourModel)
is.msarima(ourModel)


ourModel <- adam(rnorm(100,100,10))
is.adam(ourModel)

Multiple Seasonal ARIMA

Description

Function constructs Multiple Seasonal State Space ARIMA, estimating AR, MA terms and initial states. It is a wrapper of adam function.

Usage

msarima(y, orders = list(ar = c(0), i = c(1), ma = c(1)), lags = c(1),
  constant = FALSE, arma = NULL, model = NULL,
  initial = c("backcasting", "optimal", "two-stage", "complete"),
  ic = c("AICc", "AIC", "BIC", "BICc"), loss = c("likelihood", "MSE",
  "MAE", "HAM", "MSEh", "TMSE", "GTMSE", "MSCE", "GPL"), h = 0,
  holdout = FALSE, bounds = c("usual", "admissible", "none"),
  silent = TRUE, xreg = NULL, regressors = c("use", "select", "adapt"),
  initialX = NULL, ...)

auto.msarima(y, orders = list(ar = c(3, 3), i = c(2, 1), ma = c(3, 3)),
  lags = c(1, frequency(y)), initial = c("backcasting", "optimal",
  "two-stage", "complete"), ic = c("AICc", "AIC", "BIC", "BICc"),
  loss = c("likelihood", "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE",
  "MSCE", "GPL"), h = 0, holdout = FALSE, bounds = c("usual",
  "admissible", "none"), silent = TRUE, xreg = NULL,
  regressors = c("use", "select", "adapt"), initialX = NULL, ...)

Arguments

y

Vector or ts object, containing data needed to be forecasted.

orders

List of orders, containing vector variables ar, i and ma. Example: orders=list(ar=c(1,2),i=c(1),ma=c(1,1,1)). If a variable is not provided in the list, then it is assumed to be equal to zero. At least one variable should have the same length as lags. Another option is to specify orders as a vector of a form orders=c(p,d,q). The non-seasonal ARIMA(p,d,q) is constructed in this case. For auto.msarima this is the list of maximum orders to check, containing vector variables ar, i and ma. If a variable is not provided in the list, then it is assumed to be equal to zero. At least one variable should have the same length as lags.

lags

Defines lags for the corresponding orders (see examples above). The length of lags must correspond to the length of either ar, i or ma in orders variable. There is no restrictions on the length of lags vector. It is recommended to order lags ascending. The orders are set by a user. If you want the automatic order selection, then use auto.msarima function instead.

constant

If TRUE, constant term is included in the model. Can also be a number (constant value). For auto.msarima, if NULL, then the function will check if constant is needed.

arma

model

Previously estimated MSARIMA model.

initial

Can be either character or a vector of initial states. If it is character, then it can be "optimal", meaning that all initial states are optimised, or "backcasting", meaning that the initials of dynamic part of the model are produced using backcasting procedure (advised for data with high frequency). In the latter case, the parameters of the explanatory variables are optimised. This is recommended for ARIMAX model. Alternatively, you can set initial="complete" backcasting, which means that all states (including explanatory variables) are initialised via backcasting.

ic

The information criterion used in the model selection procedure.

loss

There are also available analytical approximations for multistep functions: aMSEh, aTMSE and aGTMSE. These can be useful in cases of small samples.

Finally, just for fun the absolute and half analogues of multistep estimators are available: MAEh, TMAE, GTMAE, MACE, TMAE, HAMh, THAM, GTHAM, CHAM.

h

Length of forecasting horizon.

holdout

If TRUE, holdout sample of size h is taken from the end of the data.

bounds

What type of bounds to use in the model estimation. The first letter can be used instead of the whole word. "admissible" guarantee that the model is stable. "usual" are not supported due to restrictions in adam().

silent

accepts TRUE and FALSE. If FALSE, the function will print its progress and produce a plot at the end.

xreg

regressors

The variable defines what to do with the provided xreg: "use" means that all of the data should be used, while "select" means that a selection using ic should be done.

initialX

The vector of initial parameters for exogenous variables. Ignored if xreg is NULL.

...

Other non-documented parameters. see adam for details.

FI=TRUE will make the function produce Fisher Information matrix, which then can be used to calculated variances of parameters of the model.

Details

The model, implemented in this function differs from the one in ssarima function (Svetunkov & Boylan, 2019), but it is more efficient and better fitting the data (which might be a limitation).

The basic ARIMA(p,d,q) used in the function has the following form:

(1 - B)^d (1 - a_1 B - a_2 B^2 - ... - a_p B^p) y_[t] = (1 + b_1 B + b_2 B^2 + ... + b_q B^q) \epsilon_[t] + c

where y_[t] is the actual values, \epsilon_[t] is the error term, a_i, b_j are the parameters for AR and MA respectively and c is the constant. In case of non-zero differences c acts as drift.

This model is then transformed into ARIMA in the Single Source of Error State space form (based by Snyder, 1985, but in a slightly different formulation):

y_{t} = o_{t} (w' v_{t-l} + x_t a_{t-1} + \epsilon_{t})

v_{t} = F v_{t-l} + g \epsilon_{t}

a_{t} = F_{X} a_{t-1} + g_{X} \epsilon_{t} / x_{t}

Where o_{t} is the Bernoulli distributed random variable (in case of normal data equal to 1), v_{t} is the state vector (defined based on orders) and l is the vector of lags, x_t is the vector of exogenous parameters. w is the measurement vector, F is the transition matrix, g is the persistence vector, a_t is the vector of parameters for exogenous variables, F_{X} is the transitionX matrix and g_{X} is the persistenceX matrix. The main difference from ssarima function is that this implementation skips zero polynomials, substantially decreasing the dimension of the transition matrix. As a result, this function works faster than ssarima on high frequency data, and it is more accurate.

Due to the flexibility of the model, multiple seasonalities can be used. For example, something crazy like this can be constructed: SARIMA(1,1,1)(0,1,1)[24](2,0,1)[24*7](0,0,1)[24*30], but the estimation may take some time... Still this should be estimated in finite time (not like with ssarima).

The auto.msarima function constructs several ARIMA models in Single Source of Error state space form based on adam function (see adam documentation) and selects the best one based on the selected information criterion.

For some additional details see the vignettes: vignette("adam","smooth") and vignette("ssarima","smooth")

Value

Object of class "adam" is returned. It contains the list of the following values:

model - the name of the estimated model.
timeElapsed - time elapsed for the construction of the model.
states - the matrix of the fuzzy components of msarima, where rows correspond to time and cols to states.
transition - matrix F.
persistence - the persistence vector. This is the place, where smoothing parameters live.
measurement - measurement vector of the model.
arma - list of AR/MA parameters
constant - the value of the constant term.
initialType - Type of the initial values used.
initial - the initial values of the state vector (extracted from states).
nParam - table with the number of estimated / provided parameters. If a previous model was reused, then its initials are reused and the number of provided parameters will take this into account.
fitted - the fitted values.
forecast - the point forecast.
lower - the lower bound of prediction interval. When interval="none" then NA is returned.
upper - the higher bound of prediction interval. When interval="none" then NA is returned.
residuals - the residuals of the estimated model.
errors - The matrix of 1 to h steps ahead errors. Only returned when the multistep losses are used and semiparametric interval is needed.
s2 - variance of the residuals (taking degrees of freedom into account).
interval - type of interval asked by user.
level - confidence level for interval.
cumulative - whether the produced forecast was cumulative or not.
y - the original data.
holdout - the holdout part of the original data.
xreg - provided vector or matrix of exogenous variables. If regressors="s", then this value will contain only selected exogenous variables.
initialX - initial values for parameters of exogenous variables.
ICs - values of information criteria of the model. Includes AIC, AICc, BIC and BICc.
logLik - log-likelihood of the function.
lossValue - Cost function value.
loss - Type of loss function used in the estimation.
FI - Fisher Information. Equal to NULL if FI=FALSE or when FI is not provided at all.
accuracy - vector of accuracy measures for the holdout sample. In case of non-intermittent data includes: MPE, MAPE, SMAPE, MASE, sMAE, RelMAE, sMSE and Bias coefficient (based on complex numbers). In case of intermittent data the set of errors will be: sMSE, sPIS, sCE (scaled cumulative error) and Bias coefficient. This is available only when holdout=TRUE.
B - the vector of all the estimated parameters.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Svetunkov, I., & Boylan, J. E. (2019). State-space ARIMA for supply-chain forecasting. International Journal of Production Research, 0(0), 1–10. doi:10.1080/00207543.2019.1600764

Examples


x <- rnorm(118,100,3)

# The simple call of ARIMA(1,1,1):
ourModel <- msarima(x,orders=c(1,1,1),lags=1,h=18,holdout=TRUE)

# Example of SARIMA(2,0,0)(1,0,0)[4]
msarima(x,orders=list(ar=c(2,1)),lags=c(1,4),h=18,holdout=TRUE)

# SARIMA of a peculiar order on AirPassengers data
ourModel <- msarima(AirPassengers,orders=list(ar=c(1,0,3),i=c(1,0,1),ma=c(0,1,2)),
                    lags=c(1,6,12),h=10,holdout=TRUE)

# ARIMA(1,1,1) with Mean Squared Trace Forecast Error
msarima(x,orders=c(1,1,1),lags=1,h=18,holdout=TRUE,loss="TMSE")

plot(forecast(ourModel, h=18, interval="approximate"))


x <- rnorm(118,100,3)
# The best ARIMA for the data
ourModel <- auto.msarima(x,orders=list(ar=c(2,1),i=c(1,1),ma=c(2,1)),lags=c(1,12),
                     h=18,holdout=TRUE)

# The other one using optimised states
auto.msarima(x,orders=list(ar=c(3,2),i=c(2,1),ma=c(3,2)),lags=c(1,12),
                       h=18,holdout=TRUE)

# And now combined ARIMA
auto.msarima(x,orders=list(ar=c(3,2),i=c(2,1),ma=c(3,2)),lags=c(1,12),
                       combine=TRUE,h=18,holdout=TRUE)

plot(forecast(ourModel, h=18, interval="simulated"))

Multiple seasonal classical decomposition

Description

Function decomposes multiple seasonal time series into components using the principles of classical decomposition.

Usage

msdecompose(y, lags = c(12), type = c("additive", "multiplicative"),
  smoother = c("lowess", "ma", "supsmu", "global"), ...)

Arguments

y

Vector or ts object, containing data needed to be smoothed.

lags

Vector of lags, corresponding to the frequencies in the data.

type

The type of decomposition. If "multiplicative" is selected, then the logarithm of data is taken prior to the decomposition.

smoother

The type of function used in the smoother of the data to extract the trend and in seasonality smoothing. smoother="ma" relies on the centred moving average and will result in the classical decomposition. smoother="lowess" will use lowess, resulting in a decomposition similar to the STL (stl). Finally, smoother="supsmu" will use the Friedman's super smoother via supsmu. smoother="global" will fit the global model to the data.

...

Other parameters passed to smoothers. Only works with lowess/supsmu.

Details

The function applies centred moving averages based on filter function and order specified in lags variable in order to smooth the original series and obtain level, trend and seasonal components of the series.

Value

The object of the class "msdecompose" is return, containing:

y - the original time series.
initial - the named list of the estimates of the initial level, trend and seasonality.
trend - the long term trend in the data.
seasonal - the list of seasonal parameters.
lags - the provided lags.
type - the selected type of the decomposition.
yName - the name of the provided data.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Examples


# Decomposition of multiple frequency data
## Not run: ourModel <- msdecompose(forecast::taylor, lags=c(48,336), type="m")
ourModel <- msdecompose(AirPassengers, lags=c(12), type="m")

plot(ourModel)
plot(forecast(ourModel, model="AAN", h=12))

Function returns the multiple steps ahead covariance matrix of forecast errors

Description

This function extracts covariance matrix of 1 to h steps ahead forecast errors for adam(), ssarima(), gum(), sma(), es() and ces() models.

Usage

multicov(object, type = c("analytical", "empirical", "simulated"), h = 10,
  nsim = 1000, ...)

## S3 method for class 'smooth'
multicov(object, type = c("analytical", "empirical",
  "simulated"), h = 10, nsim = 1000, ...)

Arguments

object

Model estimated using one of the functions of smooth package.

type

What method to use in order to produce covariance matrix:

analytical - based on the state space structure of the model and the one-step-ahead forecast error. This works for pure additive and pure multiplicative models. The values for the mixed models might be off.
empirical - based on the in-sample 1 to h steps ahead forecast errors (works fine on larger samples);
simulated - the data is simulated from the estimated model, then the same model is applied to it and then the empirical 1 to h steps ahead forecast errors are produced;

h

Forecast horizon to use in the calculations.

nsim

Number of iterations to produce in the simulation. Only needed if type="simulated"

...

Other parameters passed to simulate function (if type="simulated" is used). These are obs and seed. By default obs=1000. This approach increases the accuracy of covariance matrix on small samples and intermittent data;

Details

The function returns either scalar (if it is a non-smooth model) or the matrix of (h x h) size with variances and covariances of 1 to h steps ahead forecast errors.

Value

Scalar in cases of non-smooth functions. (h x h) matrix otherwise.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

Examples


x <- rnorm(100,0,1)

# A simple example with a 5x5 covariance matrix
ourModel <- ces(x, h=5)
multicov(ourModel)

Occurrence ETS model

Description

Wrapper of om that fits an occurrence (probability) ETS model to a univariate intermittent time series. ARIMA components, ARMA parameters, and explanatory-variable formulas are disabled (orders, arma, formula are all forced to NULL); use om directly if those features are needed.

Usage

oes(y, model = "MNN", lags = c(frequency(y)), persistence = NULL,
  phi = NULL, initial = c("backcasting", "optimal", "two-stage",
  "complete"), occurrence = c("auto", "fixed", "odds-ratio",
  "inverse-odds-ratio", "direct", "general"), ic = c("AICc", "AIC", "BIC",
  "BICc"), h = 0, holdout = FALSE, bounds = c("usual", "admissible",
  "none"), ets = c("conventional", "adam"), xreg = NULL,
  regressors = c("use", "select"), silent = TRUE, ...)

Arguments

y

Univariate numeric vector or time-series. Non-binary input is binarised (any non-zero value becomes 1).

model

Three-letter ETS specification (e.g. "MNN", "AAdN"). Wildcards "Z"/"X"/"Y" trigger automatic selection.

lags

Vector of seasonal lags. Defaults to frequency(y).

persistence

Optional persistence (smoothing) parameter vector.

phi

Optional damping parameter. Only used for damped-trend models.

initial

Initialisation method: "backcasting", "optimal", "two-stage", or "complete".

occurrence

Type of link function mapping the state to a probability: "fixed", "odds-ratio", "inverse-odds-ratio", or "direct".

ic

Information criterion for model selection.

h

Forecast horizon.

holdout

If TRUE, a holdout sample of size h is withheld.

bounds

Parameter bounds type.

ets

Type of ETS model: "conventional" or "adam".

xreg

Optional numeric vector or matrix of exogenous regressors, aligned with y. Merged with y into a multi-column data matrix before being passed to om.

regressors

How to handle regressors: "use" or "select".

silent

If TRUE, suppresses output and plot.

...

Additional arguments forwarded to om (e.g. maxeval, xtol_rel, algorithm, print_level).

Details

This is the analogue of es for occurrence models: a lightweight ETS-only entry point that delegates the heavy lifting to om.

Value

An object of class c("om","adam","smooth").

Examples

set.seed(42)
y <- rbinom(120, 1, 0.6)
m <- oes(y, model="MNN", occurrence="odds-ratio")
forecast(m, h=12)

Occurrence ETS general model

Description

Wrapper of omg that fits a general occurrence (probability) model — two parallel ETS sub-models A and B combined via a Beta-distribution link — to a univariate intermittent time series. ARIMA components and regression formulas are disabled; use omg directly if those are needed.

Usage

oesg(y, modelA = "MNN", modelB = modelA, lags = c(frequency(y)),
  persistenceA = NULL, persistenceB = persistenceA, phiA = NULL,
  phiB = phiA, initial = c("backcasting", "optimal", "two-stage",
  "complete"), ic = c("AICc", "AIC", "BIC", "BICc"), h = 0,
  holdout = FALSE, bounds = c("usual", "admissible", "none"),
  etsA = c("conventional", "adam"), etsB = etsA, xregA = NULL,
  xregB = NULL, regressorsA = c("use", "select"),
  regressorsB = regressorsA, silent = TRUE, ...)

Arguments

y

Univariate numeric vector or time series. Non-binary input is binarised internally by omg.

modelA

Three-letter ETS specification for sub-model A.

modelB

Three-letter ETS specification for sub-model B. Defaults to modelA.

lags

Vector of seasonal lags. Defaults to frequency(y).

persistenceA

Optional persistence vector for sub-model A.

persistenceB

Optional persistence vector for sub-model B. Defaults to persistenceA.

phiA

Optional damping parameter for sub-model A.

phiB

Optional damping parameter for sub-model B. Defaults to phiA.

initial

Initialisation method passed to both sub-models.

ic

Information criterion for model selection.

h

Forecast horizon.

holdout

If TRUE, a holdout of size h is withheld.

bounds

Parameter bounds type.

etsA

ETS type for sub-model A: "conventional" or "adam".

etsB

ETS type for sub-model B. Defaults to etsA.

xregA

Optional numeric vector or matrix of exogenous regressors for sub-model A, aligned with y.

xregB

Optional numeric vector or matrix of exogenous regressors for sub-model B, aligned with y.

regressorsA

How to handle xregA: "use" or "select".

regressorsB

How to handle xregB. Defaults to regressorsA.

silent

If TRUE, suppresses output and plot.

...

Additional arguments forwarded to omg.

Value

An object of class c("omg","om","smooth","occurrence").

Examples

set.seed(42)
y <- rbinom(120, 1, 0.3)
m <- oesg(y, modelA="MNN", modelB="MNN")
forecast(m, h=12)

Occurrence Model

Description

Function returns the occurrence part of the ADAM model with the specified probability update and model types.

Usage

om(data, model = "ZXZ", lags = c(frequency(data)), orders = list(ar =
  c(0), i = c(0), ma = c(0), select = FALSE), constant = FALSE,
  formula = NULL, regressors = c("use", "select", "adapt"),
  occurrence = c("auto", "fixed", "odds-ratio", "inverse-odds-ratio",
  "direct", "general"), loss = c("likelihood", "MSE", "MAE", "HAM", "LASSO",
  "RIDGE"), h = 0, holdout = FALSE, persistence = NULL, phi = NULL,
  initial = c("backcasting", "optimal", "two-stage", "complete"),
  arma = NULL, ic = c("AICc", "AIC", "BIC", "BICc"), bounds = c("usual",
  "admissible", "none"), ets = c("conventional", "adam"), silent = TRUE,
  ...)

Arguments

data

Numeric vector, time series, or data frame. Non-binary input is automatically binarised: any non-zero value becomes 1.

model

Three-letter ETS specification such as "MNN" or "AAN". Automatic selection with "Z" / "X" / "Y" wildcards is supported.

lags

Vector of seasonal lags. Defaults to frequency(data).

orders

ARIMA orders list: list(ar, i, ma, select).

constant

Logical; whether to include a constant term.

formula

Optional formula for external regressors.

regressors

How to handle regressors: "use", "select", or "adapt".

occurrence

Type of link function mapping state to probability: "fixed" (constant), "odds-ratio", "inverse-odds-ratio", or "direct".

loss

Loss function: "likelihood" (Bernoulli) or "MSE".

h

Forecast horizon.

holdout

If TRUE, a holdout sample of size h is withheld.

persistence

Optional persistence (smoothing) parameter vector.

phi

Optional damping parameter.

initial

Initialisation method: "backcasting", "optimal", "two-stage", or "complete".

arma

Optional fixed ARMA parameters.

ic

Information criterion for model selection.

bounds

Parameter bounds type.

ets

Type of ETS model: "conventional" or "adam".

silent

If TRUE, suppresses output and plot.

...

Additional arguments passed to the optimiser (maxeval, xtol_rel, algorithm, print_level).

Details

The function estimates probability of demand occurrence, using the selected ADAM state space model. It supports ETS, ARIMA and explanatory variables, also allowing to have multiple frequencies and doing variables selection. It is an ADAM analogue for the binary occurrence variable modelling.

For the details about the model and its implementation, see the respective vignette: vignette("om","smooth")

Value

An object of class c("om","adam","smooth").

Examples

set.seed(42)
y <- rbinom(120, 1, 0.6)
m <- om(y, model="MNN", occurrence="odds-ratio")
forecast(m, h=12)

General occurrence model

Description

Fits two parallel ETS occurrence models (A: odds-ratio, B: inverse-odds-ratio) jointly using a shared Bernoulli log-likelihood. The combined probability at each time point is p_t = p_{At} / (p_{At} + p_{Bt}).

Usage

omg(data, modelA = "MNN", modelB = modelA, ordersA = list(ar = c(0), i =
  c(0), ma = c(0), select = FALSE), ordersB = ordersA, constantA = FALSE,
  constantB = constantA, formulaA = NULL, formulaB = formulaA,
  regressorsA = c("use", "select", "adapt"), regressorsB = regressorsA,
  persistenceA = NULL, persistenceB = persistenceA, phiA = NULL,
  phiB = phiA, armaA = NULL, armaB = armaA, etsA = c("conventional",
  "adam"), etsB = etsA, lags = c(frequency(data)), h = 0,
  holdout = FALSE, initial = c("backcasting", "optimal", "two-stage",
  "complete"), loss = c("likelihood", "MSE", "MAE", "HAM", "LASSO", "RIDGE"),
  ic = c("AICc", "AIC", "BIC", "BICc"), bounds = c("usual", "admissible",
  "none"), model = NULL, silent = TRUE, ...)

Arguments

data

Binary time series (0/1), vector or data frame.

modelA

ETS model string for model A (default "MNN").

modelB

ETS model string for model B (default "MNN").

ordersA

ARIMA orders list for model A.

ordersB

ARIMA orders list for model B.

constantA

Logical, include constant in model A.

constantB

Logical, include constant in model B.

formulaA

Formula for exogenous variables in model A.

formulaB

Formula for exogenous variables in model B.

regressorsA

How to handle regressors in model A.

regressorsB

How to handle regressors in model B.

persistenceA

Persistence vector for model A.

persistenceB

Persistence vector for model B.

phiA

Damping parameter for model A.

phiB

Damping parameter for model B.

armaA

ARMA parameters for model A.

armaB

ARMA parameters for model B.

etsA

ETS variant for model A ("conventional" or "adam").

etsB

ETS variant for model B.

lags

Seasonal lags (shared).

h

Forecast horizon.

holdout

If TRUE, hold out the last h observations.

initial

Initialisation method (shared).

loss

Loss function (shared).

ic

Information criterion (shared).

bounds

Parameter bounds type (shared).

model

An already-fitted omg object. When supplied, the per-side parameters are lifted from model$modelA and model$modelB and no estimation is performed; passing FI=TRUE alongside computes the observed Fisher information over the joint parameter vector (the path used by vcov.omg).

silent

If TRUE, suppress output.

...

Additional arguments passed to the optimiser.

Value

An object of class c("omg","om","smooth").

Examples

set.seed(41)
y <- rpois(100, 0.5)
m <- omg(y)
forecast(m, h=10)

Functions that extract values from the fitted model

Description

These functions allow extracting orders and lags for ssarima(), gum() and sma(), type of model from es() and ces() and name of model.

Usage

orders(object, ...)

lags(object, ...)

modelName(object, ...)

modelType(object, ...)

Arguments

object

Model estimated using one of the functions of smooth package.

...

Currently nothing is accepted via ellipsis.

Details

orders() and lags() are useful only for SSARIMA, GUM and SMA. They return NA for other functions. This can also be applied to arima(), Arima() and auto.arima() functions from stats and forecast packages. modelType() is useful only for ETS and CES. They return NA for other functions. This can also be applied to ets() function from forecast package. errorType extracts the type of error from the model (either additive or multiplicative). Finally, modelName returns the name of the fitted model. For example, "ARIMA(0,1,1)". This is purely descriptive and can also be applied to non-smooth classes, so that, for example, you can easily extract the name of the fitted AR model from ar() function from stats package.

Value

Either vector, scalar or list with values is returned. orders() in case of ssarima returns list of values:

ar - AR orders.
i - I orders.
ma - MA orders.

lags() returns the vector of lags of the model. All the other functions return strings of character.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

Examples


x <- rnorm(100,0,1)

# Just as example. orders and lags do not return anything for ces() and es(). But modelType() does.
ourModel <- ces(x, h=10)
orders(ourModel)
lags(ourModel)
modelType(ourModel)
modelName(ourModel)

# And as another example it does the opposite for gum() and ssarima()
ourModel <- gum(x, h=10, orders=c(1,1), lags=c(1,4))
orders(ourModel)
lags(ourModel)
modelType(ourModel)
modelName(ourModel)

# Finally these values can be used for simulate functions or original functions.
ourModel <- auto.ssarima(x)
ssarima(x, orders=orders(ourModel), lags=lags(ourModel), constant=ourModel$constant)
sim.ssarima(orders=orders(ourModel), lags=lags(ourModel), constant=ourModel$constant)

ourModel <- es(x)
es(x, model=modelType(ourModel))
sim.es(model=modelType(ourModel))

Plots for the fit and states

Description

The function produces diagnostics plots for a smooth model

Usage

## S3 method for class 'adam'
plot(x, which = c(1, 2, 4, 6), level = 0.95,
  legend = FALSE, ask = prod(par("mfcol")) < length(which) &&
  dev.interactive(), lowess = TRUE, ...)

## S3 method for class 'smooth'
plot(x, which = c(1, 2, 4, 6), level = 0.95,
  legend = FALSE, ask = prod(par("mfcol")) < length(which) &&
  dev.interactive(), lowess = TRUE, ...)

## S3 method for class 'msdecompose'
plot(x, which = c(1, 2, 4, 6), level = 0.95,
  legend = FALSE, ask = prod(par("mfcol")) < length(which) &&
  dev.interactive(), lowess = TRUE, ...)

Arguments

x

Estimated smooth model.

which

Which of the plots to produce. The possible options (see details for explanations):

Actuals vs Fitted values;
Standardised residuals vs Fitted;
Studentised residuals vs Fitted;
Absolute residuals vs Fitted;
Squared residuals vs Fitted;
Q-Q plot with the specified distribution;
Fitted over time;
Standardised residuals vs Time;
Studentised residuals vs Time;
ACF of the residuals;
PACF of the residuals;
Plot of states of the model;
Absolute standardised residuals vs Fitted;
Squared standardised residuals vs Fitted;
ACF of the squared residuals;
PACF of the squared residuals.

level

Confidence level. Defines width of confidence interval. Used in plots (2), (3), (7), (8), (9), (10) and (11).

legend

If TRUE, then the legend is produced on plots (2), (3) and (7).

ask

Logical; if TRUE, the user is asked to press Enter before each plot.

lowess

Logical; if TRUE, LOWESS lines are drawn on scatterplots, see lowess.

...

The parameters passed to the plot functions. Recommended to use with separate plots.

Details

The list of produced plots includes:

Actuals vs Fitted values. Allows analysing, whether there are any issues in the fit. Does the variability of actuals increase with the increase of fitted values? Is the relation well captured? They grey line on the plot corresponds to the perfect fit of the model.
Standardised residuals vs Fitted. Plots the points and the confidence bounds (red lines) for the specified confidence level. Useful for the analysis of outliers;
Studentised residuals vs Fitted. This is similar to the previous plot, but with the residuals divided by the scales with the leave-one-out approach. Should be more sensitive to outliers;
Absolute residuals vs Fitted. Useful for the analysis of heteroscedasticity;
Squared residuals vs Fitted - similar to (3), but with squared values;
Q-Q plot with the specified distribution. Can be used in order to see if the residuals follow the assumed distribution. The type of distribution depends on the one used in the estimation (see distribution parameter in alm);
ACF of the residuals. Are the residuals autocorrelated? See acf for details;
Fitted over time. Plots actuals (black line), fitted values (purple line), point forecast (blue line) and prediction interval (grey lines). Can be used in order to make sure that the model did not miss any important events over time;
Standardised residuals vs Time. Useful if you want to see, if there is autocorrelation or if there is heteroscedasticity in time. This also shows, when the outliers happen;
Studentised residuals vs Time. Similar to previous, but with studentised residuals;
PACF of the residuals. No, really, are they autocorrelated? See pacf function from stats package for details;
Plot of the states of the model. It is not recommended to produce this plot together with the others, because there might be several states, which would cause the creation of a different canvas. In case of "msdecompose", this will produce the decomposition of the series into states on a different canvas;
Absolute standardised residuals vs Fitted. Similar to the previous, but with absolute values. This is more relevant to the models where scale is calculated as an absolute value of something (e.g. Laplace);
Squared standardised residuals vs Fitted. This is an additional plot needed to diagnose heteroscedasticity in a model with varying scale. The variance on this plot will be constant if the adequate model for scale was constructed. This is more appropriate for normal and the related distributions.

Which of the plots to produce, is specified via the which parameter.

Value

The function produces the number of plots, specified in the parameter which.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

Examples


ourModel <- es(c(rnorm(50,100,10),rnorm(50,120,10)), "ANN", h=10)
par(mfcol=c(3,4))
plot(ourModel, c(1:11))
plot(ourModel, 12)

Prediction Likelihood Score

Description

Function estimates Prediction Likelihood Score for the provided model

Usage

pls(object, holdout = NULL, ...)

## S3 method for class 'smooth'
pls(object, holdout = NULL, ...)

Arguments

object

The model estimated using smooth functions. This thing also accepts other models (e.g. estimated using functions from forecast package), but may not always work properly with them.

holdout

The values for the holdout part of the sample. If the model was fitted on the data with the holdout=TRUE, then the parameter is not needed.

...

Parameters passed to multicov function. The function is called in order to get the covariance matrix of 1 to h steps ahead forecast errors.

Details

Prediction likelihood score (PLS) is based on either normal or log-normal distribution of errors. This is extracted from the provided model. The likelihood based on the distribution of 1 to h steps ahead forecast errors is used in the process.

Value

A value of the log-likelihood.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

distribution. IEEE Signal Processing Letters. 13 (5): 300-303. doi:10.1109/LSP.2006.870353 - this is not yet used in the function.

Snyder, R. D., Ord, J. K., Beaumont, A., 2012. Forecasting the intermittent demand for slow-moving inventories: A modelling approach. International Journal of Forecasting 28 (2), 485-496.
Kolassa, S., 2016. Evaluating predictive count data distributions in retail sales forecasting. International Journal of Forecasting 32 (3), 788-803..

Examples


# Generate data, apply es() with the holdout parameter and calculate PLS
x <- rnorm(100,0,1)
ourModel <- es(x, h=10, holdout=TRUE)
pls(ourModel, type="a")
pls(ourModel, type="e")
pls(ourModel, type="s", obs=100, nsim=100)

Reapply the model with randomly generated initial parameters and produce forecasts

Description

reapply function generates the parameters based on the values in the provided object and then reapplies the same model with those parameters to the data, getting the fitted paths and updated states. reforecast function uses those values in order to produce forecasts for the h steps ahead.

Usage

reapply(object, nsim = 1000, bootstrap = FALSE, heuristics = NULL, ...)

reforecast(object, h = 10, newdata = NULL, occurrence = NULL,
  interval = c("prediction", "confidence", "none"), level = 0.95,
  side = c("both", "upper", "lower"), cumulative = FALSE, nsim = 100,
  ...)

Arguments

object

Model estimated using one of the functions of smooth package.

nsim

Number of paths to generate (number of simulations to do).

bootstrap

The logical, which determines, whether to use bootstrap for the covariance matrix of parameters or not.

heuristics

The value for proportion to use for heuristic estimation of the standard deviation of parameters. If NULL, it is not used.

...

Other parameters passed to reapply() and mean() functions in case of reforecast (trim parameter in mean() is set to 0.01 by default) and to vcov in case of reapply.

h

Forecast horizon.

newdata

The new data needed in order to produce forecasts.

occurrence

The vector containing the future occurrence variable (values in [0,1]), if it is known.

interval

What type of mechanism to use for interval construction. The options include interval="none", interval="prediction" (prediction intervals) and interval="confidence" (intervals for the point forecast). The other options are not supported and do not make much sense for the refitted model.

level

Confidence level. Defines width of prediction interval.

side

Defines, whether to provide "both" sides of prediction interval or only "upper", or "lower".

cumulative

If TRUE, then the cumulative forecast and prediction interval are produced instead of the normal ones. This is useful for inventory control systems.

Details

The main motivation of the function is to take the randomness due to the in-sample estimation of parameters into account when fitting the model and to propagate this randomness to the forecasts. The methods can be considered as a special case of recursive bootstrap.

Value

reapply() returns object of the class "reapply", which contains:

timeElapsed - Time elapsed for the code execution;
y - The actual values;
states - The array of states of the model;
refitted - The matrix with fitted values, where columns correspond to different paths;
fitted - The vector of fitted values (conditional mean);
model - The name of the constructed model;
transition - The array of transition matrices;
measurement - The array of measurement matrices;
persistence - The matrix of persistence vectors (paths in columns);
profile - The array of profiles obtained by the end of each fit.

reforecast() returns the object of the class forecast.smooth, which contains in addition to the standard list the variable paths - all simulated trajectories with h in rows, simulated future paths for each state in columns and different states (obtained from reapply() function) in the third dimension.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

Examples


x <- rnorm(100,0,1)

# Just as example. orders and lags do not return anything for ces() and es(). But modelType() does.
ourModel <- adam(x, "ANN")
refittedModel <- reapply(ourModel, nsim=50)
plot(refittedModel)

ourForecast <- reforecast(ourModel, nsim=50)

Objects exported from other packages

Description

These objects are imported from other packages. Follow the links below to see their documentation.

generics: accuracy()
xtable: xtable()

Multiple steps ahead forecast errors

Description

The function extracts 1 to h steps ahead forecast errors from the model.

Usage

rmultistep(object, h = 10, error = c("default", "additive",
  "multiplicative"), ...)

Arguments

object

Model estimated using one of the forecasting functions.

h

The forecasting horizon to use.

error

Defines what type of error to return. "default" means returning the one used in the original model. "additive" is to return e_t = y_t - mu_t. Finally, "multiplicative" will return e_t = (y_t - mu_t) / mu_t.

...

Currently nothing is accepted via ellipsis.

Details

The errors correspond to the error term epsilon_t in the ETS models. Don't forget that different models make different assumptions about epsilon_t and / or 1+epsilon_t.

Value

The matrix with observations in rows and h steps ahead values in columns. So, the first row corresponds to the forecast produced from the 0th observation from 1 to h steps ahead.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

Examples


x <- rnorm(100,0,1)
ourModel <- adam(x)
rmultistep(ourModel, h=13)

Simulate Complex Exponential Smoothing

Description

Function generates data using CES with Single Source of Error as a data generating process.

Usage

sim.ces(seasonality = c("none", "simple", "partial", "full"), obs = 10,
  nsim = 1, frequency = 1, a = NULL, b = NULL, initial = NULL,
  randomizer = c("rnorm", "rt", "rlaplace", "rs"), probability = 1, ...)

Arguments

seasonality

The type of seasonality used in CES. Can be: none - No seasonality; simple - Simple seasonality, using lagged CES (based on t-m observation, where m is the seasonality lag); partial - Partial seasonality with real seasonal components (equivalent to additive seasonality); full - Full seasonality with complex seasonal components (can do both multiplicative and additive seasonality, depending on the data). First letter can be used instead of full words. Any seasonal CES can only be constructed for time series vectors.

obs

Number of observations in each generated time series.

nsim

Number of series to generate (number of simulations to do).

frequency

Frequency of generated data. In cases of seasonal models must be greater than 1.

a

First complex smoothing parameter. Should be a complex number.

NOTE! CES is very sensitive to a and b values so it is advised to use values from previously estimated model.

b

Second complex smoothing parameter. Can be real if seasonality="partial". In case of seasonality="full" must be complex number.

initial

A matrix with initial values for CES. In case with seasonality="partial" and seasonality="full" first two columns should contain initial values for non-seasonal components, repeated frequency times.

randomizer

Type of random number generator function used for error term. Defaults are: rnorm, rt, rlaplace and rs. rlnorm should be used for multiplicative models (e.g. ETS(M,N,N)). But any function from Distributions will do the trick if the appropriate parameters are passed. For example rpois with lambda=2 can be used as well, but might result in weird values.

probability

Probability of occurrence, used for intermittent data generation. This can be a vector, implying that probability varies in time (in TSB or Croston style).

...

Details

For the information about the function, see the vignette: vignette("simulate","smooth")

Value

List of the following values is returned:

model - Name of CES model.
a - Value of complex smoothing parameter a. If nsim>1, then this is a vector.
b - Value of complex smoothing parameter b. If seasonality="none" or seasonality="simple", then this is equal to NULL. If nsim>1, then this is a vector.
initial - Initial values of CES in a form of matrix. If nsim>1, then this is an array.
profile - The final profile produced in the simulation.
data - Time series vector (or matrix if nsim>1) of the generated series.
states - Matrix (or array if nsim>1) of states. States are in columns, time is in rows.
residuals - Error terms used in the simulation. Either vector or matrix, depending on nsim.
occurrence - Values of occurrence variable. Once again, can be either a vector or a matrix...
logLik - Log-likelihood of the constructed model.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., Kourentzes, N., & Ord, J. K. (2022). Complex exponential smoothing. Naval Research Logistics, 69(8), 1108–1123. https://doi.org/10.1002/nav.22074

Examples


# Create 120 observations from CES(n). Generate 100 time series of this kind.
x <- sim.ces("n",obs=120,nsim=100)

# Generate similar thing for seasonal series of CES(s)_4
x <- sim.ces("s",frequency=4,obs=80,nsim=100)

# Estimate model and then generate 10 time series from it
ourModel <- ces(rnorm(100,100,5))
simulate(ourModel,nsim=10)

Simulate Exponential Smoothing

Description

Function generates data using ETS with Single Source of Error as a data generating process.

Usage

sim.es(model = "ANN", obs = 10, nsim = 1, frequency = 1,
  persistence = NULL, phi = 1, initial = NULL, initialSeason = NULL,
  bounds = c("usual", "admissible", "restricted"), randomizer = c("rnorm",
  "rlnorm", "rt", "rlaplace", "rs"), probability = 1, ...)

Arguments

model

Type of ETS model according to [Hyndman et. al., 2008] taxonomy. Can consist of 3 or 4 chars: ANN, AAN, AAdN, AAA, AAdA, MAdM etc.

obs

Number of observations in each generated time series.

nsim

Number of series to generate (number of simulations to do).

frequency

Frequency of generated data. In cases of seasonal models must be greater than 1.

persistence

Persistence vector, which includes all the smoothing parameters. Must correspond to the chosen model. The maximum length is 3: level, trend and seasonal smoothing parameters. If NULL, values are generated.

phi

Value of damping parameter. If trend is not chosen in the model, the parameter is ignored.

initial

Vector of initial states of level and trend. The maximum length is 2. If NULL, values are generated.

initialSeason

Vector of initial states for seasonal coefficients. Should have length equal to frequency parameter. If NULL, values are generated.

bounds

Type of bounds to use for persistence vector if values are generated. "usual" - bounds from p.156 by Hyndman et. al., 2008. "restricted" - similar to "usual" but with upper bound equal to 0.3. "admissible" - bounds from tables 10.1 and 10.2 of Hyndman et. al., 2008. Using first letter of the type of bounds also works. These bounds are also used for multiplicative models, but the models are much more restrictive, so weird results might be obtained. Be careful!

randomizer

probability

Probability of occurrence, used for intermittent data generation. This can be a vector, implying that probability varies in time (in TSB or Croston style).

...

Additional parameters passed to the chosen randomizer. All the parameters should be passed in the order they are used in chosen randomizer. For example, passing just sd=0.5 to rnorm function will lead to the call rnorm(obs, mean=0.5, sd=1). ATTENTION! When generating the multiplicative errors some tuning might be needed to obtain meaningful data. sd=0.1 is usually already a high value for such models. ALSO NOTE: In case of multiplicative error model, the randomizer will generate 1+e_t error, not e_t. This means that the mean should typically be equal to 1, not zero.

Details

For the information about the function, see the vignette: vignette("simulate","smooth")

Value

List of the following values is returned:

model - Name of ETS model.
data - Time series vector (or matrix if nsim>1) of the generated series.
states - Matrix (or array if nsim>1) of states. States are in columns, time is in rows.
persistence - Vector (or matrix if nsim>1) of smoothing parameters used in the simulation.
phi - Value of damping parameter used in time series generation.
initial - Vector (or matrix) of initial values.
initialSeason - Vector (or matrix) of initial seasonal coefficients.
profile - The final profile produced in the simulation.
probability - vector of probabilities used in the simulation.
intermittent - type of the intermittent model used.
residuals - Error terms used in the simulation. Either vector or matrix, depending on nsim.
occurrence - Values of occurrence variable. Once again, can be either a vector or a matrix...
logLik - Log-likelihood of the constructed model.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Examples


# Create 40 observations of quarterly data using AAA model with errors from normal distribution
ETSAAA <- sim.es(model="AAA",frequency=4,obs=40,randomizer="rnorm",mean=0,sd=100)

# Create 50 series of quarterly data using AAA model
# with 40 observations each with errors from normal distribution
ETSAAA <- sim.es(model="AAA",frequency=4,obs=40,randomizer="rnorm",mean=0,sd=100,nsim=50)

# Create 50 series of quarterly data using AAdA model
# with 40 observations each with errors from normal distribution
# and smoothing parameters lying in the "admissible" range.
ETSAAA <- sim.es(model="AAA",phi=0.9,frequency=4,obs=40,bounds="admissible",
                  randomizer="rnorm",mean=0,sd=100,nsim=50)

# Create 60 observations of monthly data using ANN model
# with errors from beta distribution
ETSANN <- sim.es(model="ANN",persistence=c(1.5),frequency=12,obs=60,
                  randomizer="rbeta",shape1=1.5,shape2=1.5)
plot(ETSANN$states)

# Create 60 observations of monthly data using MAM model
# with errors from uniform distribution
ETSMAM <- sim.es(model="MAdM",persistence=c(0.3,0.2,0.1),initial=c(2000,50),
           phi=0.8,frequency=12,obs=60,randomizer="runif",min=-0.5,max=0.5)

# Create 80 observations of quarterly data using MMM model
# with predefined initial values and errors from the normal distribution
ETSMMM <- sim.es(model="MMM",persistence=c(0.1,0.1,0.1),initial=c(2000,1),
           initialSeason=c(1.1,1.05,0.9,.95),frequency=4,obs=80,mean=0,sd=0.01)

# Generate intermittent data using AAdN
iETSAAdN <- sim.es("AAdN",obs=30,frequency=1,probability=0.1,initial=c(3,0),phi=0.8)

# Generate iETS(MNN) with TSB style probabilities
oETSMNN <- sim.oes("MNN",obs=50,occurrence="d",persistence=0.2,initial=1,
                   randomizer="rlnorm",meanlog=0,sdlog=0.3)
iETSMNN <- sim.es("MNN",obs=50,frequency=12,persistence=0.2,initial=4,
                  probability=oETSMNN$probability)

Simulate Generalised Exponential Smoothing

Description

Function generates data using GUM with Single Source of Error as a data generating process.

Usage

sim.gum(orders = c(1), lags = c(1), obs = 10, nsim = 1,
  frequency = 1, measurement = NULL, transition = NULL,
  persistence = NULL, initial = NULL, randomizer = c("rnorm", "rt",
  "rlaplace", "rs"), probability = 1, ...)

Arguments

orders

lags

obs

Number of observations in each generated time series.

nsim

Number of series to generate (number of simulations to do).

frequency

Frequency of generated data. In cases of seasonal models must be greater than 1.

measurement

Measurement vector w. If NULL, then randomly generated (between 0 and 1 for stability).

transition

Transition matrix F. Can be provided as a vector. Matrix will be formed using the default matrix(transition,nc,nc), where nc is the number of components in state vector. If NULL, then randomly generated.

persistence

Persistence vector g, containing smoothing parameters. If NULL, then randomly generated.

initial

Vector of initial values for state matrix. If NULL, then generated using advanced, sophisticated technique - uniform distribution.

randomizer

probability

Probability of occurrence, used for intermittent data generation. This can be a vector, implying that probability varies in time (in TSB or Croston style).

...

Details

For the information about the function, see the vignette: vignette("simulate","smooth")

Value

List of the following values is returned:

model - Name of GUM model.
measurement - Matrix w.
transition - Matrix F.
persistence - Persistence vector. This is the place, where smoothing parameters live.
initial - Initial values of GUM in a form of matrix. If nsim>1, then this is an array.
profile - The final profile produced in the simulation.
data - Time series vector (or matrix if nsim>1) of the generated series.
states - Matrix (or array if nsim>1) of states. States are in columns, time is in rows.
residuals - Error terms used in the simulation. Either vector or matrix, depending on nsim.
occurrence - Values of occurrence variable. Once again, can be either a vector or a matrix...
logLik - Log-likelihood of the constructed model.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Examples


# Create 120 observations from GUM(1[1]). Generate 100 time series of this kind.
x <- sim.gum(orders=c(1),lags=c(1),obs=120,nsim=100)

# Generate similar thing for seasonal series of GUM(1[1],1[4]])
x <- sim.gum(orders=c(1,1),lags=c(1,4),frequency=4,obs=80,nsim=100,transition=c(1,0,0.9,0.9))

# Estimate model and then generate 10 time series from it
ourModel <- gum(rnorm(100,100,5))
simulate(ourModel,nsim=10)

Simulate Occurrence Part of ETS model

Description

Function generates data using ETS with Single Source of Error as a data generating process for the demand occurrence. As the main output it produces probabilities of occurrence.

Usage

sim.oes(model = "MNN", obs = 10, nsim = 1, frequency = 1,
  occurrence = c("odds-ratio", "inverse-odds-ratio", "direct", "general"),
  bounds = c("usual", "admissible", "restricted"), randomizer = c("rlnorm",
  "rinvgauss", "rgamma", "rnorm"), persistence = NULL, phi = 1,
  initial = NULL, initialSeason = NULL, modelB = model,
  persistenceB = persistence, phiB = phi, initialB = initial,
  initialSeasonB = initialSeason, ...)

Arguments

model

Type of ETS model according to [Hyndman et. al., 2008] taxonomy. Can consist of 3 or 4 chars: ANN, AAN, AAdN, AAA, AAdA, MAdM etc. The conventional oETS model assumes that the error term is positive, so "MZZ" models are recommended for this. If you use additive error models, then the function will exponentiate the obtained values before transforming them and getting the probability. This is the type of model A.

obs

Number of observations in each generated time series.

nsim

Number of series to generate (number of simulations to do).

frequency

Frequency of generated data. In cases of seasonal models must be greater than 1.

occurrence

Type of occurrence model. See vignette("oes","smooth") for details.

bounds

randomizer

Type of random number generator function used for error term. It is advised to use rlnorm() or rinvgauss() in case of multiplicative error models. If a randomiser is used, it is advised to specify the parameters in the ellipsis.

persistence

phi

Value of damping parameter. If trend is not chosen in the model, the parameter is ignored.

initial

Vector of initial states of level and trend. The maximum length is 2. If NULL, values are generated.

initialSeason

Vector of initial states for seasonal coefficients. Should have length equal to frequency parameter. If NULL, values are generated.

modelB

Type of model B. This is used in occurrence="general" and occurrence="inverse-odds-ratio".

persistenceB

The persistence vector for the model B.

phiB

Value of damping parameter for the model B.

initialB

Vector of initial states of level and trend for the model B.

initialSeasonB

Vector of initial states for seasonal coefficients for the model B.

...

Additional parameters passed to the chosen randomizer. All the parameters should be passed in the order they are used in chosen randomizer. Both model A and model B share the same parameters for the randomizer.

Details

For the information about the function, see the vignette: vignette("simulate","smooth")

Value

List of the following values is returned:

model - Name of ETS model.
modelA - Model A, generated using sim.es() function;
modelB - Model B, generated using sim.es() function;
probability - The value of probability generated by the model;
occurrence - Type of occurrence used in the model;
logLik - Log-likelihood of the constructed model.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Examples


# This example uses rinvgauss function from statmod package.
oETSMNNIG <- sim.oes(model="MNN",frequency=12,obs=60,
                               randomizer="rinvgauss",mean=1,dispersion=0.5)

# A simpler example with log normal distribution
oETSMNNlogN <- sim.oes(model="MNN",frequency=12,obs=60,initial=1,
                       randomizer="rlnorm",meanlog=0,sdlog=0.1)

Simulate Simple Moving Average

Description

Function generates data using SMA in a Single Source of Error state space model as a data generating process.

Usage

sim.sma(order = NULL, obs = 10, nsim = 1, frequency = 1,
  initial = NULL, randomizer = c("rnorm", "rt", "rlaplace", "rs"),
  probability = 1, ...)

Arguments

order

Order of the modelled series. If omitted, then a random order from 1 to 100 is selected.

obs

Number of observations in each generated time series.

nsim

Number of series to generate (number of simulations to do).

frequency

Frequency of generated data. In cases of seasonal models must be greater than 1.

initial

Vector of initial states for the model. If NULL, values are generated.

randomizer

probability

Probability of occurrence, used for intermittent data generation. This can be a vector, implying that probability varies in time (in TSB or Croston style).

...

Details

For the information about the function, see the vignette: vignette("simulate","smooth")

Value

List of the following values is returned:

model - Name of SMA model.
data - Time series vector (or matrix if nsim>1) of the generated series.
states - Matrix (or array if nsim>1) of states. States are in columns, time is in rows.
initial - Vector (or matrix) of initial values.
probability - vector of probabilities used in the simulation.
intermittent - type of the intermittent model used.
residuals - Error terms used in the simulation. Either vector or matrix, depending on nsim.
occurrence - Values of occurrence variable. Once again, can be either a vector or a matrix...
logLik - Log-likelihood of the constructed model.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Examples


# Create 40 observations of quarterly data using AAA model with errors from normal distribution
sma10 <- sim.sma(order=10,frequency=4,obs=40,randomizer="rnorm",mean=0,sd=100)

Simulate SSARIMA

Description

Function generates data using SSARIMA with Single Source of Error as a data generating process.

Usage

sim.ssarima(orders = list(ar = 0, i = 1, ma = 1), lags = 1, obs = 10,
  nsim = 1, frequency = 1, arma = NULL, constant = FALSE,
  initial = NULL, bounds = c("admissible", "none"),
  randomizer = c("rnorm", "rt", "rlaplace", "rs"), probability = 1, ...)

Arguments

orders

lags

Defines lags for the corresponding orders (see examples above). The length of lags must correspond to the length of orders. There is no restrictions on the length of lags vector. It is recommended to order lags ascending.

obs

Number of observations in each generated time series.

nsim

Number of series to generate (number of simulations to do).

frequency

Frequency of generated data. In cases of seasonal models must be greater than 1.

arma

List with ar/ma parameters. The order of parameters should be lag-wise. This means that first all the AR parameters of the firs lag should be passed, then for the second etc. AR of another ssarima can be passed here.

constant

If TRUE, constant term is included in the model. Can also be a number (constant value).

initial

Vector of initial values for state matrix. If NULL, then generated using advanced, sophisticated technique - uniform distribution.

bounds

Type of bounds to use for AR and MA if values are generated. "admissible" - bounds guaranteeing stability and stationarity of SSARIMA. "none" - we generate something, but do not guarantee stationarity and stability. Using first letter of the type of bounds also works.

randomizer

probability

Probability of occurrence, used for intermittent data generation. This can be a vector, implying that probability varies in time (in TSB or Croston style).

...

Details

For the information about the function, see the vignette: vignette("simulate","smooth")

Value

List of the following values is returned:

model - Name of SSARIMA model.
arma - List of AR/MA parameters. If nsim>1, then this is a list of matrices.
constant - Value of constant term. If nsim>1, then this is a vector.
initial - Initial values of SSARIMA. If nsim>1, then this is a matrix.
profile - The final profile produced in the simulation.
data - Time series vector (or matrix if nsim>1) of the generated series.
states - Matrix (or array if nsim>1) of states. States are in columns, time is in rows.
residuals - Error terms used in the simulation. Either vector or matrix, depending on nsim.
occurrence - Values of occurrence variable. Once again, can be either a vector or a matrix...
logLik - Log-likelihood of the constructed model.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Svetunkov, I., & Boylan, J. E. (2019). State-space ARIMA for supply-chain forecasting. International Journal of Production Research, 0(0), 1–10. doi:10.1080/00207543.2019.1600764

Examples


# Create 120 observations from ARIMA(1,1,1) with drift. Generate 100 time series of this kind.
x <- sim.ssarima(orders=c(1,1,1),obs=120,nsim=100,constant=TRUE)

# Generate similar thing for seasonal series of SARIMA(1,1,1)(0,0,2)_4
x <- sim.ssarima(orders=list(ar=c(1,0),i=c(1,0),ma=c(1,2)),lags=c(1,4),
                 frequency=4,obs=80,nsim=100,constant=FALSE)

# Generate 10 series of high frequency data from SARIMA(1,0,2)_1(0,1,1)_7(1,0,1)_30
x <- sim.ssarima(orders=list(ar=c(1,0,1),i=c(0,1,0),ma=c(2,1,1)),lags=c(1,7,30),
                 obs=360,nsim=10)

Simulate methods for occurrence (om/omg) state-space models

Description

Re-simulates probabilities and occurrence indicators from a fitted om or omg model. The latent ETS is simulated via the shared C++ kernel (the same one simulate.adam uses), the latent series is mapped to a probability via omLinkFunction (or omgLinkFunction for omg), and a binomial draw with that probability gives the 0/1 occurrence series.

Usage

## S3 method for class 'om'
simulate(object, nsim = 1, seed = NULL, obs = nobs(object),
  ...)

## S3 method for class 'omg'
simulate(object, nsim = 1, seed = NULL, obs = nobs(object),
  ...)

Arguments

object

An object of class om (or omg).

nsim

Number of simulated series to draw.

seed

Optional integer; forwarded to set.seed at the start of the simulation. Matches stats::simulate's generic signature. When NULL (default) the global RNG state is used unchanged.

obs

Number of observations per simulated series. Defaults to the in-sample length.

...

Currently unused; kept for forward compatibility.

Details

print() on the returned object dispatches to print.oes.sim via the inherited "oes.sim" class.

Value

An S3 list of class c("om.sim","oes.sim","smooth.sim") (or c("omg.sim","oes.sim","smooth.sim") for omg) with fields:

$probability: Simulated probability series of shape (obs, nsim) – the equivalent of sim.oes()'s $probability output.
$data: 0/1 occurrence indicators of shape (obs, nsim), drawn via rbinom with the simulated probability.
$states, $residuals: Latent state cube and the errors used internally.
$model, $occurrence: Identifiers carried over from the fit.
$latent: Pre-link state-space output – internal, used by simulate.omg to combine sub-models.

Examples

## Not run: 
set.seed(7)
y <- rbinom(120, 1, prob=0.3 + 0.005*(1:120))
m <- om(y, model="MNN", occurrence="odds-ratio", silent=TRUE)
sim <- simulate(m, nsim=5, seed=42)
range(sim$probability)
table(sim$data)

## End(Not run)

Simple Moving Average

Description

Function constructs state space simple moving average of predefined order

Usage

sma(y, order = NULL, ic = c("AICc", "AIC", "BIC", "BICc"), h = 10,
  holdout = FALSE, silent = TRUE, fast = TRUE, ...)

Arguments

y

Vector or ts object, containing data needed to be forecasted.

order

Order of simple moving average. If NULL, then it is selected automatically using information criteria.

ic

The information criterion used in the model selection procedure.

h

Length of forecasting horizon.

holdout

If TRUE, holdout sample of size h is taken from the end of the data.

silent

accepts TRUE and FALSE. If FALSE, the function will print its progress and produce a plot at the end.

fast

if TRUE, then the modified Ternary search is used to find the optimal order of the model. This does not guarantee the optimal solution, but gives a reasonable one (local minimum).

...

Other non-documented parameters. For example parameter model can accept a previously estimated SMA model and use its parameters.

Details

The function constructs AR model in the Single Source of Error state space form based on the idea that:

y_{t} = \frac{1}{n} \sum_{j=1}^n y_{t-j}

which is AR(n) process, that can be modelled using:

y_{t} = w' v_{t-1} + \epsilon_{t}

v_{t} = F v_{t-1} + g \epsilon_{t}

Where v_{t} is a state vector.

For some more information about the model and its implementation, see the vignette: vignette("sma","smooth")

Value

Object of class "smooth" is returned. It contains the list of the following values:

model - the name of the estimated model.
timeElapsed - time elapsed for the construction of the model.
states - the matrix of the fuzzy components of ssarima, where rows correspond to time and cols to states.
transition - matrix F.
persistence - the persistence vector. This is the place, where smoothing parameters live.
measurement - measurement vector of the model.
order - order of moving average.
initial - Initial state vector values.
initialType - Type of initial values used.
nParam - table with the number of estimated / provided parameters. If a previous model was reused, then its initials are reused and the number of provided parameters will take this into account.
fitted - the fitted values.
forecast - the point forecast.
lower - the lower bound of prediction interval. When interval=FALSE then NA is returned.
upper - the higher bound of prediction interval. When interval=FALSE then NA is returned.
residuals - the residuals of the estimated model.
errors - The matrix of 1 to h steps ahead errors. Only returned when the multistep losses are used and semiparametric interval is needed.
s2 - variance of the residuals (taking degrees of freedom into account).
interval - type of interval asked by user.
level - confidence level for interval.
cumulative - whether the produced forecast was cumulative or not.
y - the original data.
holdout - the holdout part of the original data.
ICs - values of information criteria of the model. Includes AIC, AICc, BIC and BICc.
logLik - log-likelihood of the function.
lossValue - Cost function value.
loss - Type of loss function used in the estimation.
accuracy - vector of accuracy measures for the holdout sample. Includes: MPE, MAPE, SMAPE, MASE, sMAE, RelMAE, sMSE and Bias coefficient (based on complex numbers). This is available only when holdout=TRUE.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Svetunkov, I., & Petropoulos, F. (2017). Old dog, new tricks: a modelling view of simple moving averages. International Journal of Production Research, 7543(January), 1-14. doi:10.1080/00207543.2017.1380326

Examples


# SMA of specific order
ourModel <- sma(rnorm(118,100,3), order=12, h=18, holdout=TRUE)

# SMA of arbitrary order
ourModel <- sma(rnorm(118,100,3), h=18, holdout=TRUE)

plot(forecast(ourModel, h=18, interval="empirical"))

Combination of forecasts of state space models

Description

Function constructs ETS, SSARIMA, CES, GUM and SMA and combines their forecasts using IC weights.

Usage

smoothCombine(y, models = NULL, initial = c("backcasting", "optimal",
  "two-stage", "complete"), ic = c("AICc", "AIC", "BIC", "BICc"),
  loss = c("MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE", "MSCE"), h = 10,
  holdout = FALSE, cumulative = FALSE, interval = c("none", "prediction",
  "confidence", "simulated", "approximate", "semiparametric", "nonparametric",
  "empirical", "complete"), level = 0.95, bins = 200,
  intervalCombine = c("quantile", "probability"), bounds = c("usual",
  "admissible", "none"), silent = TRUE, xreg = NULL,
  regressors = c("use", "select"), initialX = NULL, ...)

Arguments

y

Vector or ts object, containing data needed to be forecasted.

models

List of the estimated smooth models to use in the combination. If NULL, then all the models are estimated in the function.

initial

Can be "optimal", meaning that the initial states are optimised, or "backcasting", meaning that the initials are produced using backcasting procedure.

ic

The information criterion used in the model selection procedure.

loss

There are also available analytical approximations for multistep functions: aMSEh, aTMSE and aGTMSE. These can be useful in cases of small samples.

Finally, just for fun the absolute and half analogues of multistep estimators are available: MAEh, TMAE, GTMAE, MACE, TMAE, HAMh, THAM, GTHAM, CHAM.

h

Length of forecasting horizon.

holdout

If TRUE, holdout sample of size h is taken from the end of the data.

cumulative

If TRUE, then the cumulative forecast and prediction interval are produced instead of the normal ones. This is useful for inventory control systems.

interval

Type of interval to construct. This can be:

"none", aka "n" - do not produce prediction interval.
"parametric", "p" - use state-space structure of ETS. In case of mixed models this is done using simulations, which may take longer time than for the pure additive and pure multiplicative models. This type of interval relies on unbiased estimate of in-sample error variance, which divides the sume of squared errors by T-k rather than just T.
"likelihood", "l" - these are the same as "p", but relies on the biased estimate of variance from the likelihood (division by T, not by T-k).
"semiparametric", "sp" - interval based on covariance matrix of 1 to h steps ahead errors and assumption of normal / log-normal distribution (depending on error type).
"nonparametric", "np" - interval based on values from a quantile regression on error matrix (see Taylor and Bunn, 1999). The model used in this process is e[j] = a j^b, where j=1,..,h.

The parameter also accepts TRUE and FALSE. The former means that parametric interval are constructed, while the latter is equivalent to none. If the forecasts of the models were combined, then the interval are combined quantile-wise (Lichtendahl et al., 2013).

level

Confidence level. Defines width of prediction interval.

bins

The number of bins for the prediction interval. The lower value means faster work of the function, but less precise estimates of the quantiles. This needs to be an even number.

intervalCombine

How to average the prediction interval: quantile-wise ("quantile") or probability-wise ("probability").

bounds

silent

accepts TRUE and FALSE. If FALSE, the function will print its progress and produce a plot at the end.

xreg

regressors

The variable defines what to do with the provided xreg: "use" means that all of the data should be used, while "select" means that a selection using ic should be done.

initialX

The vector of initial parameters for exogenous variables. Ignored if xreg is NULL.

...

This currently determines nothing.

timeElapsed - time elapsed for the construction of the model.
initialType - type of the initial values used.
fitted - fitted values of ETS.
quantiles - the 3D array of produced quantiles if interval!="none" with the dimensions: (number of models) x (bins) x (h).
forecast - point forecast of ETS.
lower - lower bound of prediction interval. When interval="none" then NA is returned.
upper - higher bound of prediction interval. When interval="none" then NA is returned.
residuals - residuals of the estimated model.
s2 - variance of the residuals (taking degrees of freedom into account).
interval - type of interval asked by user.
level - confidence level for interval.
cumulative - whether the produced forecast was cumulative or not.
y - original data.
holdout - holdout part of the original data.
xreg - provided vector or matrix of exogenous variables. If regressors="s", then this value will contain only selected exogenous variables.
ICs - values of information criteria of the model. Includes AIC, AICc, BIC and BICc.
accuracy - vector of accuracy measures for the holdout sample. In case of non-intermittent data includes: MPE, MAPE, SMAPE, MASE, sMAE, RelMAE, sMSE and Bias coefficient (based on complex numbers). In case of intermittent data the set of errors will be: sMSE, sPIS, sCE (scaled cumulative error) and Bias coefficient.

Details

The combination of these models using information criteria weights is possible because they are all formulated in Single Source of Error framework. Due to the the complexity of some of the models, the estimation process may take some time. So be patient.

The prediction interval are combined either probability-wise or quantile-wise (Lichtendahl et al., 2013), which may take extra time, because we need to produce all the distributions for all the models. This can be sped up with the smaller value for bins parameter, but the resulting interval may be imprecise.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Kolassa, S. (2011) Combining exponential smoothing forecasts using Akaike weights. International Journal of Forecasting, 27, pp 238 - 251.
Svetunkov, I., Boylan, J.E., 2023b. Staying Positive: Challenges and Solutions in Using Pure Multiplicative ETS Models. IMA Journal of Management Mathematics. p. 403-425. doi:10.1093/imaman/dpad028

Taylor, J.W. and Bunn, D.W. (1999) A Quantile Regression Approach to Generating Prediction Intervals. Management Science, Vol 45, No 2, pp 225-237.
Lichtendahl Kenneth C., Jr., Grushka-Cockayne Yael, Winkler Robert L., (2013) Is It Better to Average Probabilities or Quantiles? Management Science 59(7):1594-1611. DOI: doi:10.1287/mnsc.1120.1667

Examples


## Not run: ourModel <- smoothCombine(BJsales,interval="p")
plot(ourModel)
## End(Not run)

# models parameter accepts either previously estimated smoothCombine
# or a manually formed list of smooth models estimated in sample:
## Not run: smoothCombine(BJsales,models=ourModel)

## Not run: models <- list(es(BJsales), sma(BJsales))
smoothCombine(BJsales,models=models)
## End(Not run)

Function returns the ultimate answer to any question

Description

You need a description? So what?

Usage

sowhat(...)

Arguments

...

Any number of variables or string with a question.

Details

You need details? So what?

Value

It doesn't return any value, only messages. So what?

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Examples


x <- rnorm(10000,0,1);
sowhat(x);

sowhat("What's the meaning of life?")

sowhat("I don't have a girlfriend.")

Sparse ARMA Model in State Space Form

Description

Implements a Sparse ARMA model in the State Space form. Unlike standard ARIMA which expands polynomials, this function directly maps AR and MA orders to specific lags.

Usage

sparma(data, orders = list(ar = c(1), ma = c(1)), constant = FALSE,
  loss = c("likelihood", "MSE", "MAE", "HAM", "LASSO", "RIDGE", "MSEh",
  "TMSE", "GTMSE", "MSCE", "GPL"), h = 0, holdout = FALSE, arma = NULL,
  initial = c("backcasting", "optimal", "two-stage", "complete"),
  bounds = c("none", "usual", "admissible"), silent = TRUE, ...)

Arguments

data

Vector or ts object with the data

orders

List with vectors for ar and ma, specifying, which AR and MA orders to fit e.g. orders=list(ar=c(1,4), ma=0) will fit ARMA y_{t} = phi_1 y_{t-1} + phi_4 y_{t-4} + e_t

constant

Logical, whether to include a constant term (default: FALSE)

loss

Loss function type.

h

Forecast horizon (default: 0)

holdout

Logical, whether to use holdout sample (default: FALSE)

arma

List with ar and ma parameters if they do not need to be estimated

initial

Initialisation method for states

bounds

Parameter bounds

silent

Logical, whether to suppress output (default: TRUE)

...

Other parameters passed to god knows what.

Details

The model implements:

y_t = phi * y_{t-p} + theta * epsilon_{t-q} + epsilon_t

with a possibility of defining several lags for AR/MA.

Value

Object of class c("adam", "smooth") containing:

model - Model name
timeElapsed - Computation time
data - Input data
holdout - Holdout sample (if applicable)
fitted - Fitted values
residuals - Residuals
forecast - Point forecasts if h>0
states - State matrix
persistence - Persistence vector (g)
transition - Transition matrix (F)
measurement - Measurement matrix (W)
B - Vector of estimated parameters
orders - Orders specified by the user
constant - Constant value (if included)
arma - vector of ARMA parameters
initial - Initial state values
initialType - Type of initialisation
nParam - Number of parameters
logLik - Log-likelihood value
loss - Loss function used in the estimation
lossValue - Value of the loss function
accuracy - Accuracy measures

Examples

## Not run: 
# Fit SpARMA(1,1) model
model <- sparma(BJSales, orders=c(2,1), h=12, holdout=TRUE)

# Provide fixed parameters
model <- sparma(rnorm(100), orders=c(1,1), arma=c(0.7,0.5))

## End(Not run)

State Space ARIMA

Description

Function constructs State Space ARIMA, estimating AR, MA terms and initial states.

Function selects the best State Space ARIMA based on information criteria, using fancy branch and bound mechanism. The resulting model can be not optimal in IC meaning, but it is usually reasonable.

Usage

ssarima(y, orders = list(ar = c(0), i = c(1), ma = c(1)), lags = c(1,
  frequency(y)), constant = FALSE, arma = NULL, model = NULL,
  initial = c("backcasting", "optimal", "two-stage", "complete"),
  loss = c("likelihood", "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE",
  "MSCE", "GPL"), h = 0, holdout = FALSE, bounds = c("admissible",
  "usual", "none"), silent = TRUE, xreg = NULL, regressors = c("use",
  "select", "adapt"), initialX = NULL, ...)

auto.ssarima(y, orders = list(ar = c(3, 3), i = c(2, 1), ma = c(3, 3)),
  lags = c(1, frequency(y)), fast = TRUE, constant = NULL,
  initial = c("backcasting", "optimal", "two-stage", "complete"),
  loss = c("likelihood", "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE",
  "MSCE", "GPL"), ic = c("AICc", "AIC", "BIC", "BICc"), h = 0,
  holdout = FALSE, bounds = c("admissible", "usual", "none"),
  silent = TRUE, xreg = NULL, regressors = c("use", "select", "adapt"),
  ...)

Arguments

y

Vector or ts object, containing data needed to be forecasted.

orders

List of maximum orders to check, containing vector variables ar, i and ma. If a variable is not provided in the list, then it is assumed to be equal to zero. At least one variable should have the same length as lags.

lags

Defines lags for the corresponding orders (see examples). The length of lags must correspond to the length of orders. There is no restrictions on the length of lags vector.

constant

If NULL, then the function will check if constant is needed. if TRUE, then constant is forced in the model. Otherwise constant is not used.

arma

model

A previously estimated ssarima model, if provided, the function will not estimate anything and will use all its parameters.

initial

Can be either character or a vector of initial states. If it is character, then it can be "optimal", meaning that the initial states are optimised, or "backcasting", meaning that the initials are produced using backcasting procedure.

loss

There are also available analytical approximations for multistep functions: aMSEh, aTMSE and aGTMSE. These can be useful in cases of small samples.

Finally, just for fun the absolute and half analogues of multistep estimators are available: MAEh, TMAE, GTMAE, MACE, TMAE, HAMh, THAM, GTHAM, CHAM.

h

Length of forecasting horizon.

holdout

If TRUE, holdout sample of size h is taken from the end of the data.

bounds

What type of bounds to use in the model estimation. The first letter can be used instead of the whole word. In case of ssarima(), the "usual" means restricting AR and MA parameters to lie between -1 and 1.

silent

accepts TRUE and FALSE. If FALSE, the function will print its progress and produce a plot at the end.

xreg

regressors

The variable defines what to do with the provided xreg: "use" means that all of the data should be used, while "select" means that a selection using ic should be done.

initialX

The vector of initial parameters for exogenous variables. Ignored if xreg is NULL.

...

Other non-documented parameters. For example FI=TRUE will make the function also produce Fisher Information matrix, which then can be used to calculated variances of parameters of the model. Maximum orders to check can also be specified separately, however orders variable must be set to NULL: ar.orders - Maximum order of AR term. Can be vector, defining max orders of AR, SAR etc. i.orders - Maximum order of I. Can be vector, defining max orders of I, SI etc. ma.orders - Maximum order of MA term. Can be vector, defining max orders of MA, SMA etc.

fast

If TRUE, then some of the orders of ARIMA are skipped. This is not advised for models with lags greater than 12.

ic

The information criterion to use in the model selection.

Details

The model, implemented in this function, is discussed in Svetunkov & Boylan (2019).

The basic ARIMA(p,d,q) used in the function has the following form:

(1 - B)^d (1 - a_1 B - a_2 B^2 - ... - a_p B^p) y_[t] = (1 + b_1 B + b_2 B^2 + ... + b_q B^q) \epsilon_[t] + c

This model is then transformed into ARIMA in the Single Source of Error State space form (proposed in Snyder, 1985):

y_{t} = w' v_{t-l} + \epsilon_{t}

v_{t} = F v_{t-l} + g_t \epsilon_{t}

where v_{t} is the state vector (defined based on orders) and l is the vector of lags, w_t is the measurement vector (with explanatory variables if provided), F is the transition matrix, g_t is the persistence vector (which includes explanatory variables if they were used).

Due to the flexibility of the model, multiple seasonalities can be used. For example, something crazy like this can be constructed: SARIMA(1,1,1)(0,1,1)[24](2,0,1)[24*7](0,0,1)[24*30], but the estimation may take a lot of time... If you plan estimating a model with more than one seasonality, it is recommended to use msarima instead.

The model selection for SSARIMA is done by the auto.ssarima function.

For some more information about the model and its implementation, see the vignette: vignette("ssarima","smooth")

The function constructs bunch of ARIMAs in Single Source of Error state space form (see ssarima documentation) and selects the best one based on information criterion. The mechanism is described in Svetunkov & Boylan (2019).

Due to the flexibility of the model, multiple seasonalities can be used. For example, something crazy like this can be constructed: SARIMA(1,1,1)(0,1,1)[24](2,0,1)[24*7](0,0,1)[24*30], but the estimation may take a lot of time... It is recommended to use auto.msarima in cases with more than one seasonality and high frequencies.

For some more information about the model and its implementation, see the vignette: vignette("ssarima","smooth")

Value

Object of class "adam" is returned with similar elements to the adam function.

Object of class "smooth" is returned. See ssarima for details.

Author(s)

Ivan Svetunkov, ivan@svetunkov.com

References

Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.

Svetunkov, I., Boylan, J.E., 2023a. iETS: State Space Model for Intermittent Demand Forecastings. International Journal of Production Economics. 109013. doi:10.1016/j.ijpe.2023.109013
Teunter R., Syntetos A., Babai Z. (2011). Intermittent demand: Linking forecasting to inventory obsolescence. European Journal of Operational Research, 214, 606-615.
Croston, J. (1972) Forecasting and stock control for intermittent demands. Operational Research Quarterly, 23(3), 289-303.

Svetunkov, I., & Boylan, J. E. (2019). State-space ARIMA for supply-chain forecasting. International Journal of Production Research, 0(0), 1–10. doi:10.1080/00207543.2019.1600764

Examples

# ARIMA(1,1,1) fitted to some data
ourModel <- ssarima(rnorm(118,100,3),orders=list(ar=c(1),i=c(1),ma=c(1)),lags=c(1))

# Model with the same lags and orders, applied to a different data
ssarima(rnorm(118,100,3),orders=orders(ourModel),lags=lags(ourModel))

# The same model applied to a different data
ssarima(rnorm(118,100,3),model=ourModel)

# Example of SARIMA(2,0,0)(1,0,0)[4]
ssarima(rnorm(118,100,3),orders=list(ar=c(2,1)),lags=c(1,4))

# SARIMA(1,1,1)(0,0,1)[4] with different initialisations
ssarima(rnorm(118,100,3),orders=list(ar=c(1),i=c(1),ma=c(1,1)),
        lags=c(1,4),h=18,holdout=TRUE,initial="backcasting")


set.seed(41)
x <- rnorm(118,100,3)

# The best ARIMA for the data
ourModel <- auto.ssarima(x,orders=list(ar=c(2,1),i=c(1,1),ma=c(2,1)),lags=c(1,12),
                                   h=18,holdout=TRUE)

# The other one using optimised states
auto.ssarima(x,orders=list(ar=c(3,2),i=c(2,1),ma=c(3,2)),lags=c(1,12),
                       initial="two",h=18,holdout=TRUE)

summary(ourModel)
forecast(ourModel)
plot(forecast(ourModel))

Package {smooth}

Smooth package

Description

Details

Author(s)

References

See Also

Examples

Error measures for an estimated model

Description

Usage

Arguments

Details

Author(s)

Examples

ADAM is Augmented Dynamic Adaptive Model

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Automatic Occurrence Model Selection

Description

Usage

Arguments

Value

See Also

Examples

Complex Exponential Smoothing

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Centered Moving Average

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Exponential Smoothing in SSOE state space model

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Forecasting time series using smooth functions

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Generalised Univariate Model

Description

Usage

Arguments

Details

Value

Author(s)

References