Forecasting

Overview

Forecasting is a transformation that predicts future values by extracting trends and recurring patterns from historical data.

Supported forecasting algorithms:

Holt-Winters
ARIMA (Auto-Regressive Integrated Moving Average)
SSA (Singular Spectrum Analysis)
Baseline

Unlike other transformations, the forecast returns samples ahead of the selection interval.

The example below produces a forecast for the next day using the Holt-Winters algorithm with auto-detected parameters.

"forecast": {
  "horizon": {
    "interval": {"count": 1, "unit": "DAY"}
  },
  "hw": {
    "auto": true
  }
}

For graphical examples, refer to Forecasting settings in Axibase Charts.

Regularization

The forecast algorithms require a regular input series. Regularize the series with a preceding aggregation, interpolation, or group transformation, as in the example below.

[{
  "startDate": "2019-04-01T00:00:00Z",
  "endDate": "2019-04-12T00:00:00Z",
  "entity": "nurswgvml007",
  "metric": "cpu_busy",
  "aggregate": {
    "type": "AVG",
    "period": { "count": 1, "unit": "HOUR" }
  },
  "forecast": {
    "horizon": {
      "interval": { "count": 1, "unit": "DAY" }
    },
    "hw": { "auto": true },
    "range": {"min": 0, "max": 100}
  }
}]

Aggregation

"aggregate": {
  "type": "AVG",
  "period": { "count": 1, "unit": "HOUR" },
  "interpolate" : { "type": "LINEAR" }
}

Interpolation

"interpolate" : {
  "function": "LINEAR",
  "period": { "count": 1, "unit": "HOUR" }
}

Grouping

"group": {
  "type": "SUM",
  "period": { "count": 1, "unit": "HOUR"},
  "interpolate": { "type": "PREVIOUS" }
}

Auto-aggregation

As an alternative to a manually specified period, the forecast can be generated in auto-aggregation mode, in which case the period is determined automatically based on the mean sampling period.

"forecast": {
  "autoAggregate": true,
  "aggregationFunction": "AVG",
  "horizon": {
    "interval": { "count": 1, "unit": "DAY" }
  },
  "ssa": {}
}
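To illustrate what "mean sampling period" refers to, the sketch below computes the average gap between consecutive timestamps of an irregular series. The helper name `mean_sampling_period` and the sample timestamps are hypothetical, and how the database rounds the result to an aggregation period is not specified here.

```python
from datetime import datetime, timedelta

def mean_sampling_period(timestamps):
    """Average gap between consecutive samples (hypothetical helper)."""
    if len(timestamps) < 2:
        raise ValueError("at least two samples are required")
    ordered = sorted(timestamps)
    # The total span divided by the number of gaps equals the mean gap.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)

# Irregular samples: gaps of 50, 80 and 50 minutes.
samples = [
    datetime(2019, 4, 1, 0, 0),
    datetime(2019, 4, 1, 0, 50),
    datetime(2019, 4, 1, 2, 10),
    datetime(2019, 4, 1, 3, 0),
]
period = mean_sampling_period(samples)
print(period)  # 1:00:00
```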

Request Fields

The request fields described below must be included inside the forecast object.

Horizon Fields

The horizon fields specify the length of the forecasting interval. One of the duration fields (interval, length, endDate) is required.

Name	Type	Description
`interval`	object	Forecast length specified with `count` and time `unit`. For example: `{"count": 3, "unit": "DAY"}`.
`length`	number	Forecast length specified as the number of samples. The frequency of the forecast samples is the same as the frequency of the input samples.
`endDate`	string	End date until which the forecast is extrapolated. Must be later than the end date of the selection interval. ISO format date or calendar keyword.
`startDate`	string	Start date for the forecast interval. If not set, the forecast starts after the last sample in the input series. ISO format date or calendar keyword.

The forecast samples are generated starting with the timestamp following the last sample in the input series and not the end of the selection interval itself.

The start date of the forecast interval can be customized with the startDate setting, for example to compare estimated and actual values.
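The rule for forecast timestamps can be sketched as follows. This is only an illustration under stated assumptions: the function `forecast_timestamps` is hypothetical, the input series is assumed to be regular, and a custom start is assumed to simply replace the first forecast timestamp.

```python
from datetime import datetime, timedelta

def forecast_timestamps(last_sample, period, length, start=None):
    """Timestamps of `length` forecast samples spaced by `period`.

    Without `start`, the first timestamp is one period after the last
    input sample, not the end of the selection interval.
    """
    first = start if start is not None else last_sample + period
    return [first + i * period for i in range(length)]

# Hourly input whose last sample is at 23:00 on 2018-12-07.
last = datetime(2018, 12, 7, 23, 0)
stamps = forecast_timestamps(last, timedelta(hours=1), 3)
print([s.isoformat() for s in stamps])
# ['2018-12-08T00:00:00', '2018-12-08T01:00:00', '2018-12-08T02:00:00']
```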

Examples:

"forecast": {
  "horizon": {
    "interval": {"count": 1, "unit": "DAY"}
  }
}

"forecast": {
  "horizon": {
    "length": 10
  }
}

"forecast": {
  "horizon": {
    "endDate": "2018-12-15T16:00:00Z"
  }
}

Regularization Fields

Name	Type	Description
`autoAggregate`	boolean	Set to `true` to perform auto-aggregation. For Holt-Winters and ARIMA, the period is determined based on the lowest standard deviation. For SSA, the period is based on the mean sampling interval. Default value: `false`.
`aggregationFunction`	string	Aggregation function applied if auto-aggregation is enabled. Default value: `AVG`.

Control Fields

Name	Type	Description
`include`	array	Series to include in the response: input series, forecast, and reconstructed series. Allowed values: `FORECAST`, `HISTORY`, `RECONSTRUCTED`. Default value: `FORECAST`.
`scoreInterval`	object	Interval for scoring the produced forecasts in auto-parameter mode. Specified with `count` and time `unit`. For example: `{"count": 1, "unit": "DAY"}`. For SSA, the default value is the minimum of the horizon interval and `1/3` of the input series duration. For ARIMA and Holt-Winters the default value is `1/4` of the input series duration.
`range`	object	Minimum and maximum value range. A forecast value above `max` is replaced with `max`. A forecast value below `min` is replaced with `min`.
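The default `scoreInterval` rules from the table can be expressed as a short sketch. The function name and the lowercase algorithm keys are hypothetical and mirror the object names used in the request (`ssa`, `hw`, `arima`).

```python
from datetime import timedelta

def default_score_interval(algorithm, series_duration, horizon=None):
    """Default score interval as described in the Control Fields table."""
    if algorithm == "ssa":
        third = series_duration / 3
        # SSA: the smaller of the horizon interval and 1/3 of the input duration.
        return min(horizon, third) if horizon is not None else third
    if algorithm in ("hw", "arima"):
        # Holt-Winters and ARIMA: 1/4 of the input duration.
        return series_duration / 4
    raise ValueError(f"unknown algorithm: {algorithm}")

week = timedelta(days=7)
print(default_score_interval("ssa", week, horizon=timedelta(days=1)))  # 1 day, 0:00:00
print(default_score_interval("hw", timedelta(days=8)))                 # 2 days, 0:00:00
```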

Examples:

"forecast": {
  "include": ["HISTORY", "FORECAST"]
}

"forecast": {
  "scoreInterval": {"count": 1, "unit": "DAY"}
}

"forecast": {
  "range": {"min": 0, "max": 100}
}
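The effect of `range` on forecast values amounts to clamping. A minimal sketch, with a hypothetical helper name and made-up values:

```python
def apply_range(values, minimum=None, maximum=None):
    """Replace values outside [minimum, maximum] with the nearest bound."""
    clipped = []
    for value in values:
        if maximum is not None and value > maximum:
            value = maximum
        if minimum is not None and value < minimum:
            value = minimum
        clipped.append(value)
    return clipped

# A percentage metric such as cpu_busy cannot leave the 0..100 range.
print(apply_range([-3.2, 42.0, 107.5], minimum=0, maximum=100))  # [0, 42.0, 100]
```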

Holt-Winters Fields

The fields described below must be included in the forecast.hw object.

Name	Type	Description
`auto`	boolean	Generate a forecast using optimal settings. If set to `true`, parameters `alpha`, `beta`, `gamma` are detected automatically based on the lowest standard deviation within the score interval. If set to `false`, parameters `alpha`, `beta`, `gamma` are required.
`period`	object	Periodicity parameter specified with `count` and time `unit`. For example: `{"count": 1, "unit": "DAY"}`.
`alpha`	number	Alpha (data) parameter. Possible values: `[0, 1]`.
`beta`	number	Beta (trend) parameter. Possible values: `[0, 1]`.
`gamma`	number	Gamma (seasonality) parameter. Possible values: `[0, 1]`.

Examples:

"forecast": {
  "hw": {
    "auto": true
  }
}

"forecast": {
  "hw": {
    "alpha": 0.5,
    "beta": 0.3,
    "gamma": 0.5
  }
}

ARIMA Fields

The fields described below must be included in the forecast.arima object.

Name	Type	Description
`auto`	boolean	Generate a forecast using optimal settings. If set to `true`, parameters `p` and `d` are detected automatically based on the lowest standard deviation within the score interval. If set to `false`, parameters `p` and `d` are required.
`autoRegressionInterval`	object	Alternative parameter for `p` where `p` is calculated as `auto-regression-interval / interval`. Specified with `count` and time `unit`. For example: `{"count": 1, "unit": "DAY"}`.
`p`	number	Auto-regression parameter.
`d`	number	Integration parameter. Possible values: `0` or `1`.

Examples:

"forecast": {
  "arima": {
    "auto": true
  }
}

"forecast": {
  "arima": {
    "p": 2,
    "d": 0
  }
}
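The relationship between `autoRegressionInterval` and `p` can be illustrated with a small calculation. The sketch assumes that `interval` in the formula is the sampling interval of the regularized input series and that any remainder is truncated. Neither detail is confirmed by this document, and the function name is hypothetical.

```python
from datetime import timedelta

def auto_regression_order(auto_regression_interval, sampling_interval):
    """p = auto-regression-interval / interval, truncated to an integer (assumption)."""
    return auto_regression_interval // sampling_interval

# A one-day auto-regression interval over hourly samples gives p = 24.
p = auto_regression_order(timedelta(days=1), timedelta(hours=1))
print(p)  # 24
```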

Baseline Fields

Name	Type	Description
`period`	object	Determines which input series samples are used to calculate the baseline value for a given timestamp. The baseline value at timestamp `t` is calculated as the average of the input series values at timestamps `t - period`, `t - 2 * period`, ... The input series must be regular (or regularized with an aggregator) and `period` must be a multiple of the sampling interval. Specified with `count` and time `unit`. For example: `{"count": 1, "unit": "DAY"}`.
`count`	number	Alternative way to specify the `period`: `period = count * spacing`, where `spacing` is the sampling interval.
`function`	string	Aggregation function used to average values of input series.

Either period or count setting is required.

Example:

"forecast": {
  "baseline": {
    "period": {"count": 1, "unit": "DAY"},
    "function": "AVG"
  }
}
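The baseline calculation can be sketched on a plain list of regular samples. In this illustration the period is expressed in samples (the `count` form), the helper name is hypothetical, and lagged timestamps that fall beyond the input series are assumed to be skipped.

```python
from statistics import mean

def baseline_value(values, steps_ahead, count, function=mean):
    """Baseline for the timestamp `steps_ahead` samples after the last input value.

    Collects input values at t - period, t - 2 * period, ... where the period
    equals `count` samples, then applies the aggregation function.
    """
    target = len(values) - 1 + steps_ahead  # position of timestamp t
    lagged = []
    index = target - count
    while index >= 0:
        if index < len(values):  # positions past the input have no observed value
            lagged.append(values[index])
        index -= count
    if not lagged:
        raise ValueError("no input samples at the lagged timestamps")
    return function(lagged)

# Two periods of four samples each.
history = [10, 20, 30, 40, 12, 22, 32, 42]
print(baseline_value(history, steps_ahead=1, count=4))  # 11
print(baseline_value(history, steps_ahead=2, count=4))  # 21
```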

SSA Fields

The fields described below must be included in the forecast.ssa object.

If the forecast.ssa object is empty (has no fields), the forecast is generated using optimal settings based on the lowest standard deviation within the score interval.

Decomposition Parameters

The fields described below must be included in the forecast.ssa.decompose object.

Name	Type	Description
`eigentripleLimit`	number	Maximum number of eigenvectors extracted from the trajectory matrix during the singular value decomposition (SVD). Possible values: between `0` and `500`. If set to `0`, the count is determined automatically.
`method`	string	The algorithm applied in singular value decomposition (SVD) of the trajectory matrix to extract eigenvectors. Possible values: `FULL`, `TRUNCATED`, `AUTO`.
`windowLength`	number	Height (row count) of the trajectory matrix, specified as the % of the sample count in the input series. Possible values: `(0, 50]`. Default value: `50`.
`singularValueThreshold`	number	Threshold, specified in percent, to discard small eigenvectors. An eigenvector with eigenvalue λ is discarded if √λ is less than the specified % of the square root of the sum of all eigenvalues: discard if `√λ ÷ √(∑ λi) < threshold ÷ 100`. If the threshold is `0`, no vectors are discarded. Possible values: `[0, 100)`.
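To make the `windowLength` setting concrete, the sketch below builds a trajectory matrix whose row count is a percentage of the sample count. It is an illustration, not the database implementation: the function name is hypothetical and rounding down is an assumption.

```python
def trajectory_matrix(series, window_percent=50):
    """Build the trajectory (Hankel) matrix for a series.

    The row count L is `window_percent` % of the sample count N,
    and the column count is K = N - L + 1.
    """
    n = len(series)
    rows = max(1, int(n * window_percent / 100))  # rounding down is an assumption
    cols = n - rows + 1
    # Row i holds the lagged window series[i], ..., series[i + K - 1].
    return [[series[i + j] for j in range(cols)] for i in range(rows)]

for row in trajectory_matrix([1, 2, 3, 4, 5, 6]):
    print(row)
# [1, 2, 3, 4]
# [2, 3, 4, 5]
# [3, 4, 5, 6]
```

With 168 hourly samples (7 days) and the default of `50`, the same arithmetic gives 84 rows.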

Example:

"forecast": {
  "ssa": {
    "decompose": {
      "singularValueThreshold": 0.5,
      "windowLength": 50
    }
  }
}
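The discard rule for `singularValueThreshold` can be checked with a few numbers. The sketch works on singular values, which are the square roots of the eigenvalues λ, so the ratio `√λ ÷ √(∑ λi)` becomes a singular value divided by the square root of the sum of squared singular values. The function name and sample values are hypothetical.

```python
from math import sqrt

def retained_singular_values(singular_values, threshold_percent):
    """Keep values whose share of the norm is at least the threshold."""
    # sqrt(sum of eigenvalues) expressed through singular values.
    norm = sqrt(sum(s * s for s in singular_values))
    limit = threshold_percent / 100
    return [s for s in singular_values if s / norm >= limit]

values = [10.0, 3.0, 0.04]
print(retained_singular_values(values, 0.5))  # [10.0, 3.0]
print(retained_singular_values(values, 0))    # [10.0, 3.0, 0.04]
```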

Auto Grouping Parameters

The fields described below must be included in the forecast.ssa.group.auto object.

Name	Type	Description
`count`	number	Maximum number of eigenvector groups. The eigenvectors are placed into groups by the clustering method in Auto mode, or by enumerating eigenvector indexes in Manual mode. The groups are sorted by maximum eigenvalue in descending order and are named with letters `A`, `B`, `C` etc. If set to `0`, only one group is returned.
`stack`	boolean	Build groups recursively, starting with the group `A` with maximum eigenvalue, to view the cumulative effect of added eigenvectors. If enabled, group `A` contains its own eigenvectors. Group `B` contains its own eigenvectors as well as eigenvectors from group `A`. Group `C` includes its own eigenvectors as well as eigenvectors from groups `A` and `B`, etc.
`union`	array	Join eigenvectors from automatically created groups into custom groups. Multiple custom groups are separated with commas. Groups within a custom group are enumerated using a semi-colon as a separator or a hyphen for a range. For example, custom group `A;B;D` contains eigenvectors from automatic groups `A`, `B` and `D`. Custom group `A;C-E` contains eigenvectors from automatic groups `A`, `C`, `D`, `E`.

Example:

"forecast": {
  "ssa": {
    "group": {
      "auto": {
        "count": 3,
        "stack": true,
        "union": ["A;C-E", "B"]
      }
    }
  }
}
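The `union` syntax can be unpacked with a short parser. This sketch assumes single-letter group names, and `expand_union` is a hypothetical helper, not part of the API.

```python
def expand_union(spec):
    """Expand a custom group such as "A;C-E" into automatic group names."""
    names = []
    for part in spec.split(";"):
        if "-" in part:
            first, last = part.split("-")
            # A hyphen denotes an inclusive range of group letters.
            names.extend(chr(code) for code in range(ord(first), ord(last) + 1))
        else:
            names.append(part)
    return names

print([expand_union(spec) for spec in ["A;C-E", "B"]])
# [['A', 'C', 'D', 'E'], ['B']]
```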

Auto Grouping Clustering Parameters

The fields described below must be included in the forecast.ssa.group.auto.clustering object.

Name	Type	Description
`method`	string	Algorithm used to place eigenvectors into groups. Possible values: `HIERARCHICAL`, `XMEANS`, `NOVOSIBIRSK`. Default value: `HIERARCHICAL`.
`params`	object	Dictionary (map) of parameters required by the given clustering method.

Example:

"forecast": {
  "ssa": {
    "group": {
      "auto": {
        "count": 3,
        "clustering": {
          "method": "XMEANS"
        }
      }
    }
  }
}

Manual Grouping Parameters

The fields described below must be included in the forecast.ssa.group.manual object.

Name	Type	Description
`groups`	array	Join eigenvectors into custom groups by index. Multiple custom groups are separated with commas. Eigenvectors within the same group are enumerated using a semi-colon as a separator or a hyphen for a range. For example, custom group `1;3-6` contains eigenvectors with indexes `1`, `3`, `4`, `5` and `6`.

Example:

"forecast": {
  "ssa": {
    "group": {
      "manual": {
        "groups": ["1-10;12", "11;13-"]
      }
    }
  }
}
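The index notation used in `groups` can be expanded in the same spirit. The sketch below also handles an open-ended range such as `13-`, assuming it runs to the last available eigenvector. That interpretation is an assumption, as are the helper name and the `total` argument.

```python
def expand_indexes(spec, total):
    """Expand a manual group such as "1;3-6" into eigenvector indexes.

    `total` is the number of available eigenvectors and is only used
    to close an open-ended range like "13-" (assumed semantics).
    """
    indexes = []
    for part in spec.split(";"):
        if "-" in part:
            first, last = part.split("-")
            stop = int(last) if last else total
            indexes.extend(range(int(first), stop + 1))
        else:
            indexes.append(int(part))
    return indexes

print(expand_indexes("1;3-6", total=15))   # [1, 3, 4, 5, 6]
print(expand_indexes("11;13-", total=15))  # [11, 13, 14, 15]
```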

Reconstruction Parameters

The fields described below must be included in the forecast.ssa.reconstruct object.

Name	Type	Description
`averagingFunction`	string	Averaging function to calculate anti-diagonal elements of the reconstructed matrix. Possible values: `AVG`, `MEDIAN`. Default value: `AVG`.
`fourier`	boolean	Apply Fourier transform. Default value: `true`.

Example:

"forecast": {
  "ssa": {
    "reconstruct": {
      "averagingFunction": "AVG"
    }
  }
}
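The role of `averagingFunction` can be seen in a sketch of anti-diagonal averaging, the step that turns a reconstructed matrix back into a series. The function name is hypothetical, and `statistics.mean` and `statistics.median` stand in for `AVG` and `MEDIAN`.

```python
from statistics import mean, median

def diagonal_averaging(matrix, function=mean):
    """Collapse a matrix into a series by aggregating each anti-diagonal."""
    rows, cols = len(matrix), len(matrix[0])
    series = []
    for k in range(rows + cols - 1):
        # Elements on the k-th anti-diagonal satisfy i + j == k.
        diagonal = [matrix[i][k - i] for i in range(rows) if 0 <= k - i < cols]
        series.append(function(diagonal))
    return series

# For an exact Hankel matrix the original series is recovered.
hankel = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
print(diagonal_averaging(hankel))  # [1, 2, 3, 4, 5, 6]
```

When the matrix is not exactly Hankel, as after discarding eigenvectors, `mean` and `median` can produce different series, which is the choice this setting controls.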

Forecast Parameters

Name	Type	Description
`method`	string	Forecast calculation method. Possible values: `RECURRENT`, `VECTOR`. Default value: `RECURRENT`.
`base`	string	Input series to which the recurrent formula is applied when calculating the forecast. Possible values: `RECONSTRUCTED`, `ORIGINAL`.

Example:

"forecast": {
  "ssa": {
    "forecast": {
      "method": "RECURRENT"
    }
  }
}

Examples

Generate an SSA forecast for 1 day using the default parameters.

[{
  "metric":"cpu_busy",
  "entity":"nurswgvml007",
  "aggregate":{"type":"AVG","period":{"count":1,"unit":"HOUR"}},
  "startDate": "2018-12-01T00:00:00Z",
  "endDate":   "2018-12-08T00:00:00Z",
  "forecast": {
    "horizon": {
      "interval": {"count": 1, "unit": "DAY"}
    },
    "include": ["HISTORY", "FORECAST"],
    "ssa": {}
  }
}]

Generate an SSA forecast for 1 day and return 3 component groups.

[{
  "metric":"cpu_busy",
  "entity":"nurswgvml007",
  "aggregate":{"type":"AVG","period":{"count":1,"unit":"HOUR"}},
  "startDate": "2018-12-01T00:00:00Z",
  "endDate":   "2018-12-08T00:00:00Z",
  "forecast": {
    "horizon": {
      "interval": {"count": 1, "unit": "DAY"}
    },
    "include": ["HISTORY", "FORECAST"],
    "ssa": {
      "group": {
        "auto": { "count": 3 }
      }
    }
  }
}]

[
  {
    "entity": "nurswgvml007",
    "metric": "cpu_busy",
    "tags": {},
    "type": "FORECAST",
    "transformationOrder": [
      "AGGREGATE",
      "FORECAST"
    ],
    "aggregate": {
      "type": "AVG",
      "period": {
        "count": 1,
        "unit": "HOUR",
        "align": "CALENDAR"
      },
      "interpolate": {
        "type": "LINEAR",
        "extend": false,
        "windowLength": 0
      }
    },
    "forecast": {
      "ssa": {
        "implementation": "JAVA",
        "averagingFunction": "AVG",
        "fourier": true,
        "svd": "FULL",
        "clustering": {
          "method": "HIERARCHICAL"
        },
        "groupCount": 3,
        "windowLength": 84,
        "singularValuesThreshold": -1,
        "matrixNorm": 1366.6989400912828,
        "totalEigentripleCount": 84,
        "usedEigentripleCount": 30,
        "discardedEigentripleCount": 54,
        "groupedEigentripleCount": 30,
        "maxSingularValue": 1248.3197826979504,
        "discardedSingularValue": 22.215178225695876,
        "minRetainedSingularValue": 1248.3197826979504,
        "scoreStDev": 6,
        "groupingType": "AUTO_AND_STACK",
        "groupOrder": 1,
        "stack": true,
        "joinedGroups": [
          "A"
        ],
        "eigentripleIndexes": [
          1
        ],
        "singularValues": [
          1248.3197826979504
        ]
      }
    },
    "data": [
      {
        "d": "2018-12-08T00:00:00.000Z",
        "v": 16.323130767882148
      },
      {
        "d": "2018-12-08T01:00:00.000Z",
        "v": 16.342557100645664
      },
      {
        "d": "2018-12-08T02:00:00.000Z",
        "v": 16.361999901649803
      },
      {
        "d": "2018-12-08T03:00:00.000Z",
        "v": 16.38146827162976
      },
      {
        "d": "2018-12-08T04:00:00.000Z",
        "v": 16.400950500421814
      },
      {
        "d": "2018-12-08T05:00:00.000Z",
        "v": 16.420467555096234
      }
    ]
  },
  {
    "entity": "nurswgvml007",
    "metric": "cpu_busy",
    "tags": {},
    "type": "FORECAST",
    "transformationOrder": [
      "AGGREGATE",
      "FORECAST"
    ],
    "aggregate": {
      "type": "AVG",
      "period": {
        "count": 1,
        "unit": "HOUR",
        "align": "CALENDAR"
      },
      "interpolate": {
        "type": "LINEAR",
        "extend": false,
        "windowLength": 0
      }
    },
    "forecast": {
      "ssa": {
        "implementation": "JAVA",
        "averagingFunction": "AVG",
        "fourier": true,
        "svd": "FULL",
        "clustering": {
          "method": "HIERARCHICAL"
        },
        "groupCount": 3,
        "windowLength": 84,
        "singularValuesThreshold": -1,
        "matrixNorm": 1366.6989400912828,
        "totalEigentripleCount": 84,
        "usedEigentripleCount": 30,
        "discardedEigentripleCount": 54,
        "groupedEigentripleCount": 30,
        "maxSingularValue": 1248.3197826979504,
        "discardedSingularValue": 22.215178225695876,
        "minRetainedSingularValue": 77.62121560723442,
        "scoreStDev": 6,
        "groupingType": "AUTO_AND_STACK",
        "groupOrder": 2,
        "stack": true,
        "joinedGroups": [
          "A",
          "B"
        ],
        "eigentripleIndexes": [
          1,
          2,
          3,
          4,
          5,
          6,
          7,
          8,
          9,
          10,
          11,
          12,
          13,
          14,
          15,
          16,
          17,
          18,
          19,
          20,
          21,
          22,
          23,
          24
        ],
        "singularValues": [
          1248.3197826979504,
          146.94530499128334,
          143.92794141473993,
          134.99099344398564,
          133.7233440792101,
          131.9598700252816,
          131.15575519259218,
          124.69099819793217,
          120.63396980913338,
          117.99711659180291,
          117.3125200770626,
          116.82382176224395,
          112.31205236484509,
          110.74873548967649,
          109.1089829354941,
          107.200678759671,
          99.56623413354185,
          97.73271184511265,
          96.49813770348236,
          92.718762965264,
          89.95945733645226,
          88.04449833394744,
          78.5215925412881,
          77.62121560723442
        ]
      }
    },
    "data": [
      {
        "d": "2018-12-08T00:00:00.000Z",
        "v": 16.941437368091886
      },
      {
        "d": "2018-12-08T01:00:00.000Z",
        "v": 12.939489053146213
      },
      {
        "d": "2018-12-08T02:00:00.000Z",
        "v": 11.16216931816397
      },
      {
        "d": "2018-12-08T03:00:00.000Z",
        "v": 18.375039896564356
      },
      {
        "d": "2018-12-08T04:00:00.000Z",
        "v": 13.609606683445598
      },
      {
        "d": "2018-12-08T05:00:00.000Z",
        "v": 13.272122376879008
      }
    ]
  },
  {
    "entity": "nurswgvml007",
    "metric": "cpu_busy",
    "tags": {},
    "type": "FORECAST",
    "transformationOrder": [
      "AGGREGATE",
      "FORECAST"
    ],
    "aggregate": {
      "type": "AVG",
      "period": {
        "count": 1,
        "unit": "HOUR",
        "align": "CALENDAR"
      },
      "interpolate": {
        "type": "LINEAR",
        "extend": false,
        "windowLength": 0
      }
    },
    "forecast": {
      "ssa": {
        "implementation": "JAVA",
        "averagingFunction": "AVG",
        "fourier": true,
        "svd": "FULL",
        "clustering": {
          "method": "HIERARCHICAL"
        },
        "groupCount": 3,
        "windowLength": 84,
        "singularValuesThreshold": -1,
        "matrixNorm": 1366.6989400912828,
        "totalEigentripleCount": 84,
        "usedEigentripleCount": 30,
        "discardedEigentripleCount": 54,
        "groupedEigentripleCount": 30,
        "maxSingularValue": 1248.3197826979504,
        "discardedSingularValue": 22.215178225695876,
        "minRetainedSingularValue": 23.91003219261129,
        "scoreStDev": 6,
        "groupingType": "AUTO_AND_STACK",
        "groupOrder": 3,
        "stack": true,
        "joinedGroups": [
          "A",
          "B",
          "C"
        ],
        "eigentripleIndexes": [
          1,
          2,
          3,
          4,
          5,
          6,
          7,
          8,
          9,
          10,
          11,
          12,
          13,
          14,
          15,
          16,
          17,
          18,
          19,
          20,
          21,
          22,
          23,
          24,
          25,
          26,
          27,
          28,
          29,
          30
        ],
        "singularValues": [
          1248.3197826979504,
          146.94530499128334,
          143.92794141473993,
          134.99099344398564,
          133.7233440792101,
          131.9598700252816,
          131.15575519259218,
          124.69099819793217,
          120.63396980913338,
          117.99711659180291,
          117.3125200770626,
          116.82382176224395,
          112.31205236484509,
          110.74873548967649,
          109.1089829354941,
          107.200678759671,
          99.56623413354185,
          97.73271184511265,
          96.49813770348236,
          92.718762965264,
          89.95945733645226,
          88.04449833394744,
          78.5215925412881,
          77.62121560723442,
          26.21438894527334,
          26.08667243107496,
          25.381606345570727,
          25.1518659560464,
          24.97900929304035,
          23.91003219261129
        ]
      }
    },
    "data": [
      {
        "d": "2018-12-08T00:00:00.000Z",
        "v": 16.0076042167122
      },
      {
        "d": "2018-12-08T01:00:00.000Z",
        "v": 14.01962309351334
      },
      {
        "d": "2018-12-08T02:00:00.000Z",
        "v": 11.958413336437872
      },
      {
        "d": "2018-12-08T03:00:00.000Z",
        "v": 17.71256946367641
      },
      {
        "d": "2018-12-08T04:00:00.000Z",
        "v": 13.062727400642963
      },
      {
        "d": "2018-12-08T05:00:00.000Z",
        "v": 14.110200706263136
      }
    ]
  }
]
