Forecasting
Overview
Forecasting is a transformation that predicts future values by extracting trends and recurring patterns from historical data.
Supported forecasting algorithms:
- Holt-Winters
- ARIMA (Auto-Regressive Integrated Moving Average)
- SSA (Singular Spectrum Analysis)
- Baseline
Unlike other transformations, the forecast returns samples ahead of the selection interval.
The example below produces a forecast for the next day using the Holt-Winters algorithm with auto-detected parameters.
"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
},
"hw": {
"auto": true
}
}
For graphical examples, refer to Forecasting settings in Axibase Charts.
Regularization
The forecast algorithms need the input series to be regularized, which requires a preceding aggregation, interpolation, or group transformation, as in the example below.
[{
"startDate": "2019-04-01T00:00:00Z",
"endDate": "2019-04-12T00:00:00Z",
"entity": "nurswgvml007",
"metric": "cpu_busy",
"aggregate": {
"type": "AVG",
"period": { "count": 1, "unit": "HOUR" }
},
"forecast": {
"horizon": {
"interval": { "count": 1, "unit": "DAY" }
},
"hw": { "auto": true },
"range": {"min": 0, "max": 100}
}
}]
- Aggregation
"aggregate": {
"type": "AVG",
"period": { "count": 1, "unit": "HOUR" },
"interpolate" : { "type": "LINEAR" }
}
- Interpolation
"interpolate" : {
"function": "LINEAR",
"period": { "count": 1, "unit": "HOUR" }
}
- Grouping
"group": {
"type": "SUM",
"period": { "count": 1, "unit": "HOUR"},
"interpolate": { "type": "PREVIOUS" }
}
- Auto-aggregation
As an alternative to a manually specified period, the forecast can be generated in auto-aggregation mode. In this mode the period is determined automatically; for SSA, as in the example below, it is based on the mean sampling period.
"forecast": {
"autoAggregate": true,
"aggregationFunction": "AVG",
"horizon": {
"interval": { "count": 1, "unit": "DAY" }
},
"ssa": {}
}
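The idea of deriving a period from the mean sampling period can be sketched in a few lines of Python. This is only an illustration: the function name `mean_sampling_period` and the sample timestamps are hypothetical, and the server may round or align the detected period differently.

```python
from datetime import datetime, timedelta

def mean_sampling_period(timestamps):
    """Return the mean gap between consecutive timestamps as a timedelta."""
    if len(timestamps) < 2:
        raise ValueError("at least two samples are required")
    ordered = sorted(timestamps)
    # The mean of consecutive gaps equals the total span divided by the gap count.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)

# Hypothetical irregular samples: gaps of 50, 70 and 60 minutes.
samples = [
    datetime(2019, 4, 1, 0, 0),
    datetime(2019, 4, 1, 0, 50),
    datetime(2019, 4, 1, 2, 0),
    datetime(2019, 4, 1, 3, 0),
]
print(mean_sampling_period(samples))
```

For the samples above, the mean gap is one hour, so an hourly period would be a natural choice for regularization.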
Request Fields
The request fields described below must be included inside the forecast object.
Horizon Fields
The horizon fields specify the length of the forecasting interval. One of the duration fields (interval, length, endDate) is required.
Name | Type | Description |
---|---|---|
interval | object | Forecast length specified with count and time unit. For example: {"count": 3, "unit": "DAY"}. |
length | number | Forecast length specified as the number of samples. The frequency of the forecast samples is the same as the frequency of the input samples. |
endDate | string | End date until which the forecast is extrapolated. Must be later than the end date of the selection interval. ISO format date or calendar keyword. |
startDate | string | Start date for the forecast interval. If not set, the forecast starts after the last sample in the input series. ISO format date or calendar keyword. |
The forecast samples are generated starting with the timestamp that follows the last sample in the input series, not with the end of the selection interval.
The start date of the forecast interval can be customized with the startDate setting, for example to compare estimated and actual values.
Examples:
"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
}
}
"forecast": {
"horizon": {
"length": 10
}
}
"forecast": {
"horizon": {
"endDate": "2018-12-15T16:00:00Z"
}
}
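The relationship between the horizon settings and the timestamps of the forecast samples can be sketched as follows. The helper names are hypothetical, and the sketch assumes a regular series with a fixed spacing and a horizon that starts right after the last input sample (no startDate).

```python
from datetime import datetime, timedelta

def length_from_interval(interval, spacing):
    """Number of forecast samples that fit into a horizon interval."""
    return interval // spacing

def forecast_timestamps(last_sample, spacing, length):
    """Timestamps of `length` forecast samples that follow `last_sample`."""
    return [last_sample + spacing * (i + 1) for i in range(length)]

# Assumption: hourly series whose last sample is at 2018-12-07 23:00.
spacing = timedelta(hours=1)
last_sample = datetime(2018, 12, 7, 23, 0)

length = length_from_interval(timedelta(days=1), spacing)
stamps = forecast_timestamps(last_sample, spacing, length)
print(length, stamps[0], stamps[-1])
```

Under these assumptions a 1-day interval and a length of 24 describe the same horizon, and the first forecast sample lands on 2018-12-08 00:00.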
Regularization Fields
Name | Type | Description |
---|---|---|
autoAggregate | boolean | Set to true to perform auto-aggregation. For Holt-Winters and ARIMA, the period is determined based on the lowest standard deviation. For SSA, the period is based on the mean sampling interval. Default value: false. |
aggregationFunction | string | Aggregation function applied if auto-aggregation is enabled. Default value: AVG . |
Control Fields
Name | Type | Description |
---|---|---|
include | array | Include input series, forecast, and reconstructed series in the response. Allowed values: FORECAST, HISTORY, RECONSTRUCTED. Default value: FORECAST. |
scoreInterval | object | Interval for scoring the produced forecasts in auto-parameter mode. Specified with count and time unit. For example: {"count": 1, "unit": "DAY"}. For SSA, the default value is the minimum of the horizon interval and 1/3 of the input series duration. For ARIMA and Holt-Winters, the default value is 1/4 of the input series duration. |
range | object | Minimum and maximum value range. A forecast value that exceeds max is replaced with max. A forecast value below min is replaced with min. |
Examples:
"forecast": {
"include": ["HISTORY", "FORECAST"]
}
"forecast": {
"scoreInterval": {"count": 1, "unit": "DAY"}
}
"forecast": {
"range": {"min": 0, "max": 100}
}
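The range and scoreInterval rules from the table can be expressed as a short Python sketch. The function names and the algorithm labels ("ssa", "hw", "arima") are hypothetical and serve only to illustrate the documented behavior.

```python
from datetime import timedelta

def clamp_to_range(values, minimum=None, maximum=None):
    """Replace values above `maximum` with `maximum` and values below `minimum` with `minimum`."""
    clamped = []
    for value in values:
        if maximum is not None and value > maximum:
            value = maximum
        if minimum is not None and value < minimum:
            value = minimum
        clamped.append(value)
    return clamped

def default_score_interval(algorithm, series_duration, horizon):
    """Default scoreInterval as described in the table above."""
    if algorithm == "ssa":
        # Minimum of the horizon interval and 1/3 of the input series duration.
        return min(horizon, series_duration / 3)
    # ARIMA and Holt-Winters: 1/4 of the input series duration.
    return series_duration / 4

print(clamp_to_range([-5, 50, 120], minimum=0, maximum=100))
print(default_score_interval("ssa", timedelta(days=7), timedelta(days=1)))
print(default_score_interval("hw", timedelta(days=7), timedelta(days=1)))
```

With a 7-day input series and a 1-day horizon, the SSA default is limited by the horizon, while the Holt-Winters default depends only on the series duration.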
Holt-Winters Fields
The fields described below must be included in the forecast.hw object.
Name | Type | Description |
---|---|---|
auto | boolean | Generate a forecast using optimal settings. If set to true, parameters alpha, beta, gamma are detected automatically based on the lowest standard deviation within the score interval. If set to false, parameters alpha, beta, gamma are required. |
period | object | Periodicity parameter specified with count and time unit. For example: {"count": 1, "unit": "DAY"}. |
alpha | number | Alpha (data) parameter. Possible values: [0, 1]. |
beta | number | Beta (trend) parameter. Possible values: [0, 1]. |
gamma | number | Gamma (seasonality) parameter. Possible values: [0, 1]. |
Examples:
"forecast": {
"hw": {
"auto": true
}
}
"forecast": {
"hw": {
"alpha": 0.5,
"beta": 0.3,
"gamma": 0.5
}
}
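The constraints on the Holt-Winters fields can be checked on the client before a request is sent. The sketch below is a hypothetical helper, not part of any client library, and it assumes that an omitted auto field is treated as false.

```python
def validate_hw(hw):
    """Return a list of problems found in a forecast.hw object (empty if none)."""
    problems = []
    auto = hw.get("auto", False)  # assumption: a missing `auto` means false
    for name in ("alpha", "beta", "gamma"):
        value = hw.get(name)
        if value is None:
            if not auto:
                problems.append(f"{name} is required when auto is false")
        elif not 0 <= value <= 1:
            problems.append(f"{name} must be within [0, 1]")
    return problems

print(validate_hw({"auto": True}))
print(validate_hw({"alpha": 0.5, "beta": 0.3, "gamma": 0.5}))
print(validate_hw({"alpha": 1.5, "beta": 0.3}))
```

The first two calls correspond to the examples above and report no problems; the third reports an out-of-range alpha and a missing gamma.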
ARIMA Fields
The fields described below must be included in the forecast.arima object.
Name | Type | Description |
---|---|---|
auto | boolean | Generate a forecast using optimal settings. If set to true, parameters p and d are detected automatically based on the lowest standard deviation within the score interval. If set to false, parameters p and d are required. |
autoRegressionInterval | object | Alternative parameter for p, where p is calculated as auto-regression-interval / interval. Specified with count and time unit. For example: {"count": 1, "unit": "DAY"}. |
p | number | Auto-regression parameter. |
d | number | Integration parameter. Possible values: 0 or 1. |
Examples:
"forecast": {
"arima": {
"auto": true
}
}
"forecast": {
"arima": {
"p": 2,
"d": 0
}
}
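The conversion from autoRegressionInterval to p can be illustrated with a small calculation. The sketch assumes that "interval" in the formula refers to the sampling interval of the regularized series, and rejecting a non-integer ratio is also an assumption; the function name is hypothetical.

```python
from datetime import timedelta

def p_from_interval(auto_regression_interval, sampling_interval):
    """Compute p as auto-regression-interval / interval."""
    if auto_regression_interval % sampling_interval != timedelta(0):
        # Assumption: a non-integer ratio is treated as a configuration error.
        raise ValueError("auto-regression interval is not a multiple of the sampling interval")
    return auto_regression_interval // sampling_interval

# A 1-day auto-regression interval on an hourly series.
print(p_from_interval(timedelta(days=1), timedelta(hours=1)))
```

Under these assumptions, {"count": 1, "unit": "DAY"} on an hourly series is equivalent to setting p to 24.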
Baseline Fields
The fields described below must be included in the forecast.baseline object.
Name | Type | Description |
---|---|---|
period | object | Determines which input series samples are used to calculate the baseline value for a given timestamp. The baseline value at timestamp t is calculated as the average of the input series values at timestamps t - period, t - 2 * period, ... The input series must be regular (or regularized with an aggregator) and the period must be divisible by the sampling interval. Specified with count and time unit. For example: {"count": 1, "unit": "DAY"}. |
count | number | Alternative way to specify the period: period = count * spacing, where spacing is the sampling interval. |
function | string | Aggregation function used to average values of the input series. |
Either the period or the count setting is required.
Example:
"forecast": {
"baseline": {
"period": {"count": 1, "unit": "DAY"},
"function": "AVG"
}
}
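The baseline calculation described above can be sketched on a list of regularly spaced values. The function below is a simplified illustration that uses the count form of the period (number of samples per period); the name `baseline_value` and the sample data are hypothetical.

```python
from statistics import mean

def baseline_value(series, index, period_count, function=mean):
    """Apply `function` to samples located 1, 2, ... periods before `index`."""
    lagged = [
        series[i]
        for i in range(index - period_count, -1, -period_count)
        if i < len(series)  # skip positions that are not part of the history
    ]
    if not lagged:
        raise ValueError("no history available one period before this position")
    return function(lagged)

# Two periods of history, three samples per period.
history = [10.0, 20.0, 30.0, 12.0, 22.0, 32.0]

# Positions 6, 7, 8 are the first three forecast samples after the history.
baseline = [baseline_value(history, i, 3) for i in range(6, 9)]
print(baseline)
```

Each forecast value is the average of the samples at the same position within the earlier periods, for example the first one averages 12.0 and 10.0.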
SSA Fields
The fields described below must be included in the forecast.ssa object.
If the forecast.ssa object is empty (has no fields), the forecast is generated using optimal settings based on the lowest standard deviation within the score interval.
Decomposition Parameters
The fields described below must be included in the forecast.ssa.decompose object.
Name | Type | Description |
---|---|---|
eigentripleLimit | number | Maximum number of eigenvectors extracted from the trajectory matrix during the singular value decomposition (SVD). Possible values: between 0 and 500. If set to 0, the count is determined automatically. |
method | string | The algorithm applied in singular value decomposition (SVD) of the trajectory matrix to extract eigenvectors. Possible values: FULL, TRUNCATED, AUTO. |
windowLength | number | Height (row count) of the trajectory matrix, specified as a percentage of the sample count in the input series. Possible values: (0, 50]. Default value: 50. |
singularValueThreshold | number | Threshold, specified in percent, to discard small eigenvectors. An eigenvector with eigenvalue λ is discarded if √λ is less than the specified percentage of the square root of the sum of all eigenvalues: discard if √λ ÷ √(∑ λi) < threshold ÷ 100. If threshold is 0, no vectors are discarded. Possible values: [0, 100). |
Example:
"forecast": {
"ssa": {
"decompose": {
"singularValueThreshold": 0.5,
"windowLength": 50
}
}
}
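The windowLength and singularValueThreshold rules can be illustrated with a short calculation. The function names and the sample eigenvalues are hypothetical, and rounding the row count down is an assumption.

```python
import math

def window_rows(sample_count, window_length_percent=50):
    """Row count of the trajectory matrix for a window length given in percent."""
    return sample_count * window_length_percent // 100  # assumption: rounded down

def retained_eigenvalues(eigenvalues, threshold_percent):
    """Keep eigenvalues that pass the rule: discard if sqrt(ev) / sqrt(sum) < threshold / 100."""
    norm = math.sqrt(sum(eigenvalues))
    # With a threshold of 0 the ratio is never below the limit, so nothing is discarded.
    return [ev for ev in eigenvalues if math.sqrt(ev) / norm >= threshold_percent / 100]

print(window_rows(168))
print(retained_eigenvalues([100.0, 25.0, 1.0], 10))
print(retained_eigenvalues([100.0, 25.0, 1.0], 0))
```

A week of hourly data contains 168 samples, so the default window length of 50% gives 84 rows, which appears consistent with the windowLength of 84 reported in the response in the Examples section. In the second call the smallest eigenvalue is discarded because its ratio (about 0.09) is below 0.1.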
Auto Grouping Parameters
The fields described below must be included in the forecast.ssa.group.auto object.
Name | Type | Description |
---|---|---|
count | number | Maximum number of eigenvector groups. The eigenvectors are placed into groups by the clustering method in Auto mode, or by enumerating eigenvector indexes in Manual mode. The groups are sorted by maximum eigenvalue in descending order and are named with letters A, B, C, etc. If set to 0, only one group is returned. |
stack | boolean | Build groups recursively, starting with the group A with maximum eigenvalue, to view the cumulative effect of added eigenvectors. If enabled, group A contains its own eigenvectors. Group B contains its own eigenvectors as well as eigenvectors from group A. Group C includes its own eigenvectors as well as eigenvectors from groups A and B, etc. |
union | array | Join eigenvectors from automatically created groups into custom groups. Multiple custom groups are separated with commas. Groups within a custom group are enumerated using a semicolon as a separator or a hyphen for a range. For example, custom group A;B;D contains eigenvectors from automatic groups A, B and D. Custom group A;C-E contains eigenvectors from automatic groups A, C, D, E. |
Example:
"forecast": {
"ssa": {
"group": {
"auto": {
"count": 3,
"stack": true,
"union": ["A;C-E", "B"]
}
}
}
}
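The stack and union settings can be illustrated with a small Python sketch. The helper names are hypothetical, single-letter group names are assumed, and the eigenvector indexes are chosen to mirror the grouping visible in the response in the Examples section (A has 1 eigenvector, A and B together have 24, all three have 30).

```python
def expand_union(expression):
    """Expand a union expression such as 'A;C-E' into ['A', 'C', 'D', 'E']."""
    names = []
    for part in expression.split(";"):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            # Assumption: group names are single letters, so a range walks the alphabet.
            names.extend(chr(code) for code in range(ord(first), ord(last) + 1))
        else:
            names.append(part)
    return names

def stack_groups(groups):
    """Make each group cumulative: B includes A, C includes A and B, and so on."""
    stacked, seen = {}, []
    for name, indexes in groups.items():  # groups are expected in A, B, C ... order
        seen = seen + indexes
        stacked[name] = seen
    return stacked

groups = {"A": [1], "B": list(range(2, 25)), "C": list(range(25, 31))}
stacked = stack_groups(groups)
print({name: len(indexes) for name, indexes in stacked.items()})
print(expand_union("A;C-E"))
```

The stacked group sizes grow as 1, 24 and 30, which is why the series for later groups in a stacked response repeat the eigenvectors of the earlier ones.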
Auto Grouping Clustering Parameters
The fields described below must be included in the forecast.ssa.group.auto.clustering object.
Name | Type | Description |
---|---|---|
method | string | Algorithm used to place eigenvectors into groups. Possible values: HIERARCHICAL, XMEANS, NOVOSIBIRSK. Default value: HIERARCHICAL. |
params | object | Dictionary (map) of parameters required by the given clustering method. |
Example:
"forecast": {
"ssa": {
"group": {
"auto": {
"count": 3,
"clustering": {
"method": "XMEANS"
}
}
}
}
}
Manual Grouping Parameters
The fields described below must be included in the forecast.ssa.group.manual object.
Name | Type | Description |
---|---|---|
groups | array | Join eigenvectors into custom groups by index. Multiple custom groups are separated with commas. Eigenvectors within the same group are enumerated using a semicolon as a separator or a hyphen for a range. For example, custom group 1;3-6 contains eigenvectors with indexes 1, 3, 4, 5 and 6. |
Example:
"forecast": {
"ssa": {
"group": {
"manual": {
"groups": ["1-10;12", "11;13-"]
}
}
}
}
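The index syntax for manual groups can be expanded with a few lines of Python. The function name is hypothetical. Only closed ranges, as described in the table, are handled; the open-ended form that appears in the example (13-) is not covered by the description and is rejected here rather than guessed.

```python
def expand_indexes(expression):
    """Expand a manual group such as '1;3-6' into [1, 3, 4, 5, 6]."""
    indexes = []
    for part in expression.split(";"):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            if not first or not last:
                raise ValueError(f"open-ended range is not supported in this sketch: {part!r}")
            indexes.extend(range(int(first), int(last) + 1))
        else:
            indexes.append(int(part))
    return indexes

print(expand_indexes("1;3-6"))
print(expand_indexes("1-10;12"))
```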
Reconstruction Parameters
The fields described below must be included in the forecast.ssa.reconstruct object.
Name | Type | Description |
---|---|---|
averagingFunction | string | Averaging function to calculate anti-diagonal elements of the reconstructed matrix. Possible values: AVG, MEDIAN. Default value: AVG. |
fourier | boolean | Apply Fourier transform. Default value: true. |
Example:
"forecast": {
"ssa": {
"reconstruct": {
"averagingFunction": "AVG"
}
}
}
Forecast Parameters
The fields described below must be included in the forecast.ssa.forecast object.
Name | Type | Description |
---|---|---|
method | string | Forecast calculation method. Possible values: RECURRENT, VECTOR. Default value: RECURRENT. |
base | string | Input series to which the recurrent formula is applied when calculating the forecast. Possible values: RECONSTRUCTED, ORIGINAL. |
Example:
"forecast": {
"ssa": {
"forecast": {
"method": "RECURRENT"
}
}
}
Examples
Generate an SSA forecast for 1 day using the default parameters.
[{
"metric":"cpu_busy",
"entity":"nurswgvml007",
"aggregate":{"type":"AVG","period":{"count":1,"unit":"HOUR"}},
"startDate": "2018-12-01T00:00:00Z",
"endDate": "2018-12-08T00:00:00Z",
"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
},
"include": ["HISTORY", "FORECAST"],
"ssa": {}
}
}]
Generate an SSA forecast for 1 day and return 3 component groups. The request is shown first, followed by the response.
[{
"metric":"cpu_busy",
"entity":"nurswgvml007",
"aggregate":{"type":"AVG","period":{"count":1,"unit":"HOUR"}},
"startDate": "2018-12-01T00:00:00Z",
"endDate": "2018-12-08T00:00:00Z",
"forecast": {
"horizon": {
"interval": {"count": 1, "unit": "DAY"}
},
"include": ["HISTORY", "FORECAST"],
"ssa": {
"group": {
"auto": { "count": 3 }
}
}
}
}]
[
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"tags": {},
"type": "FORECAST",
"transformationOrder": [
"AGGREGATE",
"FORECAST"
],
"aggregate": {
"type": "AVG",
"period": {
"count": 1,
"unit": "HOUR",
"align": "CALENDAR"
},
"interpolate": {
"type": "LINEAR",
"extend": false,
"windowLength": 0
}
},
"forecast": {
"ssa": {
"implementation": "JAVA",
"averagingFunction": "AVG",
"fourier": true,
"svd": "FULL",
"clustering": {
"method": "HIERARCHICAL"
},
"groupCount": 3,
"windowLength": 84,
"singularValuesThreshold": -1,
"matrixNorm": 1366.6989400912828,
"totalEigentripleCount": 84,
"usedEigentripleCount": 30,
"discardedEigentripleCount": 54,
"groupedEigentripleCount": 30,
"maxSingularValue": 1248.3197826979504,
"discardedSingularValue": 22.215178225695876,
"minRetainedSingularValue": 1248.3197826979504,
"scoreStDev": 6,
"groupingType": "AUTO_AND_STACK",
"groupOrder": 1,
"stack": true,
"joinedGroups": [
"A"
],
"eigentripleIndexes": [
1
],
"singularValues": [
1248.3197826979504
]
}
},
"data": [
{
"d": "2018-12-08T00:00:00.000Z",
"v": 16.323130767882148
},
{
"d": "2018-12-08T01:00:00.000Z",
"v": 16.342557100645664
},
{
"d": "2018-12-08T02:00:00.000Z",
"v": 16.361999901649803
},
{
"d": "2018-12-08T03:00:00.000Z",
"v": 16.38146827162976
},
{
"d": "2018-12-08T04:00:00.000Z",
"v": 16.400950500421814
},
{
"d": "2018-12-08T05:00:00.000Z",
"v": 16.420467555096234
}
]
},
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"tags": {},
"type": "FORECAST",
"transformationOrder": [
"AGGREGATE",
"FORECAST"
],
"aggregate": {
"type": "AVG",
"period": {
"count": 1,
"unit": "HOUR",
"align": "CALENDAR"
},
"interpolate": {
"type": "LINEAR",
"extend": false,
"windowLength": 0
}
},
"forecast": {
"ssa": {
"implementation": "JAVA",
"averagingFunction": "AVG",
"fourier": true,
"svd": "FULL",
"clustering": {
"method": "HIERARCHICAL"
},
"groupCount": 3,
"windowLength": 84,
"singularValuesThreshold": -1,
"matrixNorm": 1366.6989400912828,
"totalEigentripleCount": 84,
"usedEigentripleCount": 30,
"discardedEigentripleCount": 54,
"groupedEigentripleCount": 30,
"maxSingularValue": 1248.3197826979504,
"discardedSingularValue": 22.215178225695876,
"minRetainedSingularValue": 77.62121560723442,
"scoreStDev": 6,
"groupingType": "AUTO_AND_STACK",
"groupOrder": 2,
"stack": true,
"joinedGroups": [
"A",
"B"
],
"eigentripleIndexes": [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23,
24
],
"singularValues": [
1248.3197826979504,
146.94530499128334,
143.92794141473993,
134.99099344398564,
133.7233440792101,
131.9598700252816,
131.15575519259218,
124.69099819793217,
120.63396980913338,
117.99711659180291,
117.3125200770626,
116.82382176224395,
112.31205236484509,
110.74873548967649,
109.1089829354941,
107.200678759671,
99.56623413354185,
97.73271184511265,
96.49813770348236,
92.718762965264,
89.95945733645226,
88.04449833394744,
78.5215925412881,
77.62121560723442
]
}
},
"data": [
{
"d": "2018-12-08T00:00:00.000Z",
"v": 16.941437368091886
},
{
"d": "2018-12-08T01:00:00.000Z",
"v": 12.939489053146213
},
{
"d": "2018-12-08T02:00:00.000Z",
"v": 11.16216931816397
},
{
"d": "2018-12-08T03:00:00.000Z",
"v": 18.375039896564356
},
{
"d": "2018-12-08T04:00:00.000Z",
"v": 13.609606683445598
},
{
"d": "2018-12-08T05:00:00.000Z",
"v": 13.272122376879008
}
]
},
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"tags": {},
"type": "FORECAST",
"transformationOrder": [
"AGGREGATE",
"FORECAST"
],
"aggregate": {
"type": "AVG",
"period": {
"count": 1,
"unit": "HOUR",
"align": "CALENDAR"
},
"interpolate": {
"type": "LINEAR",
"extend": false,
"windowLength": 0
}
},
"forecast": {
"ssa": {
"implementation": "JAVA",
"averagingFunction": "AVG",
"fourier": true,
"svd": "FULL",
"clustering": {
"method": "HIERARCHICAL"
},
"groupCount": 3,
"windowLength": 84,
"singularValuesThreshold": -1,
"matrixNorm": 1366.6989400912828,
"totalEigentripleCount": 84,
"usedEigentripleCount": 30,
"discardedEigentripleCount": 54,
"groupedEigentripleCount": 30,
"maxSingularValue": 1248.3197826979504,
"discardedSingularValue": 22.215178225695876,
"minRetainedSingularValue": 23.91003219261129,
"scoreStDev": 6,
"groupingType": "AUTO_AND_STACK",
"groupOrder": 3,
"stack": true,
"joinedGroups": [
"A",
"B",
"C"
],
"eigentripleIndexes": [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23,
24,
25,
26,
27,
28,
29,
30
],
"singularValues": [
1248.3197826979504,
146.94530499128334,
143.92794141473993,
134.99099344398564,
133.7233440792101,
131.9598700252816,
131.15575519259218,
124.69099819793217,
120.63396980913338,
117.99711659180291,
117.3125200770626,
116.82382176224395,
112.31205236484509,
110.74873548967649,
109.1089829354941,
107.200678759671,
99.56623413354185,
97.73271184511265,
96.49813770348236,
92.718762965264,
89.95945733645226,
88.04449833394744,
78.5215925412881,
77.62121560723442,
26.21438894527334,
26.08667243107496,
25.381606345570727,
25.1518659560464,
24.97900929304035,
23.91003219261129
]
}
},
"data": [
{
"d": "2018-12-08T00:00:00.000Z",
"v": 16.0076042167122
},
{
"d": "2018-12-08T01:00:00.000Z",
"v": 14.01962309351334
},
{
"d": "2018-12-08T02:00:00.000Z",
"v": 11.958413336437872
},
{
"d": "2018-12-08T03:00:00.000Z",
"v": 17.71256946367641
},
{
"d": "2018-12-08T04:00:00.000Z",
"v": 13.062727400642963
},
{
"d": "2018-12-08T05:00:00.000Z",
"v": 14.110200706263136
}
]
}
]
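A response with several component groups can be summarized on the client to see which groups each returned series combines. The sketch below is a hypothetical helper that reads only fields visible in the response above; the `response` variable is a trimmed stand-in for the parsed JSON, not a complete payload.

```python
def summarize_groups(response):
    """Return (groupOrder, joined group names, eigentriple count, first value) per series."""
    rows = []
    for series in response:
        ssa = series["forecast"]["ssa"]
        rows.append((
            ssa["groupOrder"],
            "+".join(ssa["joinedGroups"]),
            len(ssa["eigentripleIndexes"]),
            series["data"][0]["v"],
        ))
    return sorted(rows)

# Trimmed stand-in for the response above: only the fields read by the function.
response = [
    {"forecast": {"ssa": {"groupOrder": 2, "joinedGroups": ["A", "B"],
                          "eigentripleIndexes": list(range(1, 25))}},
     "data": [{"d": "2018-12-08T00:00:00.000Z", "v": 16.941437368091886}]},
    {"forecast": {"ssa": {"groupOrder": 1, "joinedGroups": ["A"],
                          "eigentripleIndexes": [1]}},
     "data": [{"d": "2018-12-08T00:00:00.000Z", "v": 16.323130767882148}]},
    {"forecast": {"ssa": {"groupOrder": 3, "joinedGroups": ["A", "B", "C"],
                          "eigentripleIndexes": list(range(1, 31))}},
     "data": [{"d": "2018-12-08T00:00:00.000Z", "v": 16.0076042167122}]},
]

for row in summarize_groups(response):
    print(row)
```

Sorting by groupOrder lists the series from the single leading group A to the combination of all three groups, regardless of the order in which they arrive.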