# # Forecasting

## # Overview

Forecasting is a transformation that predicts future values by extracting trends and recurring patterns from historical data.

Supported forecasting algorithms:

• `Holt-Winters`
• `ARIMA` (Auto-Regressive Integrated Moving Average).
• `SSA` (Singular Spectrum Analysis).
• `baseline`

Unlike other transformations, the forecast returns samples ahead of the selection interval.

The example below produces a forecast for the next day using the Holt-Winters algorithm with auto-detected parameters.

``````"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
},
"hw": {
"auto": true
}
}
``````

For graphical examples, refer to Forecasting settings in Axibase Charts.

## # Regularization

The forecast algorithms need the input series to be regularized which requires a preceding `aggregation`, `interpolation`, or a `group` transformation as in the example below.

``````[{
"startDate": "2019-04-01T00:00:00Z",
"endDate": "2019-04-12T00:00:00Z",
"entity": "nurswgvml007",
"metric": "cpu_busy",
"aggregate": {
"type": "AVG",
"period": { "count": 1, "unit": "HOUR" }
},
"forecast": {
"horizon": {
"interval": { "count": 1, "unit": "DAY" }
},
"hw": { "auto": true },
"range": {"min": 0, "max": 100}
}
}]
``````
• Aggregation
``````"aggregate": {
"type": "AVG",
"period": { "count": 1, "unit": "HOUR" },
"interpolate" : { "type": "LINEAR" }
}
``````
• Interpolation
``````"interpolate" : {
"function": "LINEAR",
"period": { "count": 1, "unit": "HOUR" }
}
``````
• Grouping
``````"group": {
"type": "SUM",
"period": { "count": 1, "unit": "HOUR"},
"interpolate": { "type": "PREVIOUS" }
}
``````
• Auto-aggregation

As an alternative to manually specified period, the forecast can be generated in auto-aggregation mode in which case the period is determined automatically based on the mean sampling period.

``````"forecast": {
"autoAggregate": true,
"aggregationFunction": "AVG",
"horizon": {
"interval": { "count": 1, "unit": "DAY" }
},
"ssa": {}
}
``````

## # Request Fields

The request fields described below must be included inside the `forecast` object.

### # Horizon Fields

The horizon fields specify the length of the forecasting interval. One of the duration fields (`interval`, `length`, `endDate`) is required.

Name Type Description
`interval` object Forecast length specified with `count` and time `unit`.
For example: `{"count": 3, "unit": "DAY"}`.
`length` number Forecast length specified as the number of samples. The frequency of the forecast samples is the same as the frequency of the input samples.
`endDate` string End date until the forecast must be extrapolated. Must be greater then the end date of the selection interval.
ISO format date or calendar keyword.
`startDate` string Start date for the forecast interval. If not set, the forecast starts after the last sample in the input series.
ISO format date or calendar keyword.

The forecast samples are generated starting with the timestamp following the last sample in the input series and not the end of the selection interval itself.

The start date of the forecast interval can be customized with `startDate` setting to compare estimated and actual values.

Examples:

``````"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
}
}
``````
``````"forecast": {
"horizon": {
"length": 10
}
}
``````
``````"forecast": {
"horizon": {
"endDate": "2018-12-15T16:00:00Z"
}
}
``````

### # Regularization Fields

Name Type Description
`autoAggregate` boolean Set to `true` to perform auto-aggregation.
For Holt-Winters and ARIMA the period is determined based on lowest standard deviation. For SSA, the period is based on mean sampling interval.
Default value: `false`.
`aggregationFunction` string Aggregation function applied if auto-aggregation is enabled.
Default value: `AVG`.

### # Control Fields

Name Type Description
`include` array Include input series, forecast, and reconstructed series into response.
Allowed values: `FORECAST`, `HISTORY`, `RECONSTRUCTED`.
Default value: `FORECAST`.
`scoreInterval` object Interval for scoring the produced forecasts in auto-parameter mode. Specified with `count` and time `unit`.
For example: `{"count": 1, "unit": "DAY"}`.
For SSA, the default value is the minimum of the horizon interval and `1/3` of the input series duration.
For ARIMA and Holt-Winters the default value is `1/4` of the input series duration.
`range` object Minimum and maximum value range.
If forecast value exceeds `max`, such value is replaced with `max`. If forecast value is below `min`, such value is replaced with `min`.

Examples:

``````"forecast": {
"include": ["HISTORY", "FORECAST"]
}
``````
``````"forecast": {
"scoreInterval": {"count": 1, "unit": "DAY"}
}
``````
``````"forecast": {
"range": {"min": 0, "max": 100}
}
``````

### # Holt-Winters Fields

The fields described below must be included in the `forecast.hw` object.

Name Type Description
`auto` boolean Generate a forecast using most optimal settings.
If set to `true`, parameters `alpha`, `beta`, `gamma` are detected automatically based on the lowest standard deviation within the score interval.
If set to `false`, parameters `alpha`, `beta`, `gamma` are required.
`period` object Periodicity parameter specified with `count` and time `unit`.
For example: `{"count": 1, "unit": "DAY"}`.
`alpha` number Alpha (data) parameter.
Possible values: `[0, 1]`.
`beta` number Beta (trend) parameter.
Possible values: `[0, 1]`.
`gamma` number Gamma (seasonality) parameter.
Possible values: `[0, 1]`.

Examples:

``````"forecast": {
"hw": {
"auto": true
}
}
``````
``````"forecast": {
"hw": {
"alpha": 0.5,
"beta": 0.3,
"gamma": 0.5
}
}
``````

### # ARIMA Fields

The fields described below must be included in the `forecast.arima` object.

Name Type Description
`auto` boolean Generate a forecast using most optimal settings.
If set to `true`, parameters `p` and `d` are detected automatically based on the lowest standard deviation within the score interval.
If set to `false`, parameters `p` and `d` are required.
`autoRegressionInterval` object Alternative parameter for `p` where `p` is calculated as `auto-regression-interval / interval`.
Specified with `count` and time `unit`.
For example: `{"count": 1, "unit": "DAY"}`.
`p` number Auto-regression parameter.
`d` number Integration parameter.
Possible values: `0` or `1`.

Examples:

``````"forecast": {
"arima": {
"auto": true
}
}
``````
``````"forecast": {
"arima": {
"p": 2,
"d": 0
}
}
``````

### # Baseline Fields

Name Type Description
`period` object The setting determines which input series samples are used to calculate baseline value for the given timestamp. Baseline value at timestamp `t` is calculated as averaged value of input series for timestamps `t - period`, `t - 2 * period`, ... The input series must be regular (or regularized with an aggregator) and the sampling interval must be divisible by `period`. Specified with `count` and time `units`. For example: `{"count": 1, "unit": "DAY"}`.
`count` number Alternative way to specify the `period`: `period = count * spacing`, where `spacing` is the sampling interval.
`function` String Aggregation function used to average values of input series.

Either `period` or `count` setting is required.

Example:

``````"forecast": {
"baseline": {
"period": {"count": 1, "unit": "DAY"},
"function": "AVG"
}
}
``````

### # SSA Fields

The fields described below must be included in the `forecast.ssa` object.

If the `forecast.ssa` object is empty but has no fields, the forecast is generated using most optimal settings based on the lowest standard deviation within the score interval.

#### # Decomposition Parameters

The fields described below must be included in the `forecast.ssa.decompose` object.

Name Type Description
`eigentripleLimit` number Maximum number of eigenvectors extracted from the trajectory matrix during the singular value decomposition (SVD).
Possible values: between `0` and `500`.
If set to `0`, the count is determined automatically.
`method` string The algorithm applied in singular value decomposition (SVD) of the trajectory matrix to extract eigenvectors.
Possible values: `FULL`, `TRUNCATED`, `AUTO`.
`windowLength` number Height (row count) of the trajectory matrix, specified as the % of the sample count in the input series.
Possible values: `(0, 50]`.
Default value: `50`.
`singularValueThreshold` number Threshold, specified in percent, to discard small eigenvectors. Eigenvector with eigenvalue λ is discarded if √λ is less than the specified % of √ sum of all eigenvalues.
Discard if `√λ ÷ √ (∑ λi) < threshold ÷ 100`
If threshold is `0`, no vectors are discarded.
Possible values: `[0, 100)`.

Example:

``````"forecast": {
"ssa": {
"decompose": {
"singularValueThreshold": 0.5,
"windowLength": 50
}
}
}
``````

#### # Auto Grouping Parameters

The fields described below must be included in the `forecast.ssa.group.auto` object.

Name Type Description
`count` number Maximum number of eigenvector groups. The eigenvectors are placed into groups by the clustering method in Auto mode, or using by enumerating eigenvector indexes in Manual mode. The groups are sorted by maximum eigenvalue in descending order and are named with letters `A`, `B`, `C` etc.
If set to `0`, only one group is returned.
`stack` boolean Build groups recursively, starting with the group `A` with maximum eigenvalue, to view the cumulative effect of added eigenvectors. In enabled, group `A` contains its own eigenvectors. Group `B` contains its own eigenvectors as well as eigenvectors from group `A`. Group `C` includes its own eigenvectors as well as eigenvectors from group `A` and `B`, etc.
`union` array Join eigenvectors from automatically created groups into custom groups. Multiple custom groups are separated using comma. Groups within the custom group are enumerated using semi-colon as a separator or hyphen for range. For example, custom group `A;B;D` contains eigenvectors from automatic groups `A`, `B` and `D`. Custom group `A;C-E` contains eigenvectors from automatic groups `A`,`C`,`D`,`E`.

Example:

``````"forecast": {
"ssa": {
"group": {
"auto": {
"count": 3,
"stack": true,
"union": ["A;C-E", "B"]
}
}
}
}
``````

#### # Auto Grouping Clustering Parameters

The fields described below must be included in the `forecast.ssa.group.auto.clustering` object.

Name Type Description
`method` string Algorithm used to place eigenvectors into groups.
Possible values: `HIERARCHICAL`, `XMEANS`, `NOVOSIBIRSK`.
Default value: `HIERARCHICAL`.
`params` object Dictionary (map) of parameters required by the given clustering method.

Example:

``````"forecast": {
"ssa": {
"group": {
"auto": {
"count": 3,
"clustering": {
"method": "XMEANS"
}
}
}
}
}
``````

#### # Manual Grouping Parameters

The fields described below must be included in the `forecast.ssa.group.manual` object.

Name Type Description
`groups` array Join eigenvectors using their index into custom groups. Multiple custom groups are separated using comma. Eigenvectors within the same group are enumerated using semi-colon as a separator or hyphen for range. For example, custom group `1;3-6` contains eigenvectors with indexes `1`, `3`, `4`, `5` and `6`.

Example:

``````"forecast": {
"ssa": {
"group": {
"manual": {
"groups": ["1-10;12", "11;13-"]
}
}
}
}
``````

#### # Reconstruction Parameters

The fields described below must be included in the `forecast.ssa.reconstruct` object.

Name Type Description
`averagingFunction` string Averaging function to calculate anti-diagonal elements of the reconstructed matrix.
Possible values: `AVG`, `MEDIAN`.
Default value: `AVG`.
`fourier` boolean Apply Fourier transform.
Default value: `true`.

Example:

``````"forecast": {
"ssa": {
"reconstruct": {
"averagingFunction": "AVG"
}
}
}
``````

#### # Forecast Parameters

Name Type Description
`method` string Forecast calculation method.
Possible values: `RECURRENT`, `VECTOR`.
Default value: `RECURRENT`.
`base` boolean Input series to which the recurrent formula is applied when calculating the forecast.
Possible values: `RECONSTRUCTED`, `ORIGINAL`.

Example:

``````"forecast": {
"ssa": {
"forecast": {
"method": "RECURRENT"
}
}
}
``````

## # Examples

Generate SSA forecast for 1 day using the default parameters.

``````[{
"metric":"cpu_busy",
"entity":"nurswgvml007",
"aggregate":{"type":"AVG","period":{"count":1,"unit":"HOUR"}},
"startDate": "2018-12-01T00:00:00Z",
"endDate":   "2018-12-08T00:00:00Z",
"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
},
"include": ["HISTORY", "FORECAST"],
"ssa": {}
}
}]
``````

Generate SSA forecast for 1 day and return 3 component groups.

``````[{
"metric":"cpu_busy",
"entity":"nurswgvml007",
"aggregate":{"type":"AVG","period":{"count":1,"unit":"HOUR"}},
"startDate": "2018-12-01T00:00:00Z",
"endDate":   "2018-12-08T00:00:00Z",
"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
},
"include": ["HISTORY", "FORECAST"],
"ssa": {
"group": {
"auto": { "count": 3 }
}
}
}
}]
``````
``````[
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"tags": {},
"type": "FORECAST",
"transformationOrder": [
"AGGREGATE",
"FORECAST"
],
"aggregate": {
"type": "AVG",
"period": {
"count": 1,
"unit": "HOUR",
"align": "CALENDAR"
},
"interpolate": {
"type": "LINEAR",
"extend": false,
"windowLength": 0
}
},
"forecast": {
"ssa": {
"implementation": "JAVA",
"averagingFunction": "AVG",
"fourier": true,
"svd": "FULL",
"clustering": {
"method": "HIERARCHICAL"
},
"groupCount": 3,
"windowLength": 84,
"singularValuesThreshold": -1,
"matrixNorm": 1366.6989400912828,
"totalEigentripleCount": 84,
"usedEigentripleCount": 30,
"groupedEigentripleCount": 30,
"maxSingularValue": 1248.3197826979504,
"minRetainedSingularValue": 1248.3197826979504,
"scoreStDev": 6,
"groupingType": "AUTO_AND_STACK",
"groupOrder": 1,
"stack": true,
"joinedGroups": [
"A"
],
"eigentripleIndexes": [
1
],
"singularValues": [
1248.3197826979504
]
}
},
"data": [
{
"d": "2018-12-08T00:00:00.000Z",
"v": 16.323130767882148
},
{
"d": "2018-12-08T01:00:00.000Z",
"v": 16.342557100645664
},
{
"d": "2018-12-08T02:00:00.000Z",
"v": 16.361999901649803
},
{
"d": "2018-12-08T03:00:00.000Z",
"v": 16.38146827162976
},
{
"d": "2018-12-08T04:00:00.000Z",
"v": 16.400950500421814
},
{
"d": "2018-12-08T05:00:00.000Z",
"v": 16.420467555096234
}
]
},
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"tags": {},
"type": "FORECAST",
"transformationOrder": [
"AGGREGATE",
"FORECAST"
],
"aggregate": {
"type": "AVG",
"period": {
"count": 1,
"unit": "HOUR",
"align": "CALENDAR"
},
"interpolate": {
"type": "LINEAR",
"extend": false,
"windowLength": 0
}
},
"forecast": {
"ssa": {
"implementation": "JAVA",
"averagingFunction": "AVG",
"fourier": true,
"svd": "FULL",
"clustering": {
"method": "HIERARCHICAL"
},
"groupCount": 3,
"windowLength": 84,
"singularValuesThreshold": -1,
"matrixNorm": 1366.6989400912828,
"totalEigentripleCount": 84,
"usedEigentripleCount": 30,
"groupedEigentripleCount": 30,
"maxSingularValue": 1248.3197826979504,
"minRetainedSingularValue": 77.62121560723442,
"scoreStDev": 6,
"groupingType": "AUTO_AND_STACK",
"groupOrder": 2,
"stack": true,
"joinedGroups": [
"A",
"B"
],
"eigentripleIndexes": [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23,
24
],
"singularValues": [
1248.3197826979504,
146.94530499128334,
143.92794141473993,
134.99099344398564,
133.7233440792101,
131.9598700252816,
131.15575519259218,
124.69099819793217,
120.63396980913338,
117.99711659180291,
117.3125200770626,
116.82382176224395,
112.31205236484509,
110.74873548967649,
109.1089829354941,
107.200678759671,
99.56623413354185,
97.73271184511265,
96.49813770348236,
92.718762965264,
89.95945733645226,
88.04449833394744,
78.5215925412881,
77.62121560723442
]
}
},
"data": [
{
"d": "2018-12-08T00:00:00.000Z",
"v": 16.941437368091886
},
{
"d": "2018-12-08T01:00:00.000Z",
"v": 12.939489053146213
},
{
"d": "2018-12-08T02:00:00.000Z",
"v": 11.16216931816397
},
{
"d": "2018-12-08T03:00:00.000Z",
"v": 18.375039896564356
},
{
"d": "2018-12-08T04:00:00.000Z",
"v": 13.609606683445598
},
{
"d": "2018-12-08T05:00:00.000Z",
"v": 13.272122376879008
}
]
},
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"tags": {},
"type": "FORECAST",
"transformationOrder": [
"AGGREGATE",
"FORECAST"
],
"aggregate": {
"type": "AVG",
"period": {
"count": 1,
"unit": "HOUR",
"align": "CALENDAR"
},
"interpolate": {
"type": "LINEAR",
"extend": false,
"windowLength": 0
}
},
"forecast": {
"ssa": {
"implementation": "JAVA",
"averagingFunction": "AVG",
"fourier": true,
"svd": "FULL",
"clustering": {
"method": "HIERARCHICAL"
},
"groupCount": 3,
"windowLength": 84,
"singularValuesThreshold": -1,
"matrixNorm": 1366.6989400912828,
"totalEigentripleCount": 84,
"usedEigentripleCount": 30,
"groupedEigentripleCount": 30,
"maxSingularValue": 1248.3197826979504,
"minRetainedSingularValue": 23.91003219261129,
"scoreStDev": 6,
"groupingType": "AUTO_AND_STACK",
"groupOrder": 3,
"stack": true,
"joinedGroups": [
"A",
"B",
"C"
],
"eigentripleIndexes": [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23,
24,
25,
26,
27,
28,
29,
30
],
"singularValues": [
1248.3197826979504,
146.94530499128334,
143.92794141473993,
134.99099344398564,
133.7233440792101,
131.9598700252816,
131.15575519259218,
124.69099819793217,
120.63396980913338,
117.99711659180291,
117.3125200770626,
116.82382176224395,
112.31205236484509,
110.74873548967649,
109.1089829354941,
107.200678759671,
99.56623413354185,
97.73271184511265,
96.49813770348236,
92.718762965264,
89.95945733645226,
88.04449833394744,
78.5215925412881,
77.62121560723442,
26.21438894527334,
26.08667243107496,
25.381606345570727,
25.1518659560464,
24.97900929304035,
23.91003219261129
]
}
},
"data": [
{
"d": "2018-12-08T00:00:00.000Z",
"v": 16.0076042167122
},
{
"d": "2018-12-08T01:00:00.000Z",
"v": 14.01962309351334
},
{
"d": "2018-12-08T02:00:00.000Z",
"v": 11.958413336437872
},
{
"d": "2018-12-08T03:00:00.000Z",
"v": 17.71256946367641
},
{
"d": "2018-12-08T04:00:00.000Z",
"v": 13.062727400642963
},
{
"d": "2018-12-08T05:00:00.000Z",
"v": 14.110200706263136
}
]
}
]
``````