Data Forecasting
Overview
ATSD includes a range of univariate forecasting algorithms that predict future series values based on historical data.
Supported algorithms:
- Holt-Winters
- ARIMA
- Singular Spectrum Analysis
The accuracy of predictions depends on the frequency of data collection, the selection interval, and the algorithm parameters.
Forecasting Example with Abnormal Deviation:
Reference
Editor Settings
Enable data forecasting on the Data > Forecasts page.
General Settings
Enabled forecasts are prepared by background jobs on schedule according to cron
expressions. Forecasting jobs are typically executed during off-peak hours.
Setting | Description |
---|---|
Retention Interval | Specifies how long a forecast is stored in the database. Forecasts that are older than current time , or End Time, if specified, minus Retention Interval are deleted. |
Data Selection Settings
Setting | Description |
---|---|
Metric | Metric name for which forecasts are calculated. |
Entity | If selected, forecasts are calculated for the specified entity. Supersedes Entity Group drop-down list. If neither entity nor entity group is specified, forecasts are prepared for all entities. |
Entity Group | If selected, forecasts are calculated for entities contained in the specified entity group. |
Tags | Prepare forecasts only for series containing the specified series tags. |
End Time | End time of the Data Selection Interval and Series Selection Interval. This field supports calendar expressions, for example current_day .If not defined, the field is set to the time the job is run. |
Data Selection Interval | Time frame for selecting detailed data used as forecast input. Specify the end of the interval in the End Time field, otherwise the end of the selection interval is set to current time. |
Series Selection Interval | Ignore any series with Last Insert Time before End Time by more than the specified interval.Use this option to ignore series which have not been updated for a long time. |
Calendar | Ignore detailed values within the time intervals listed in the calendar. |
Empty Period Threshold | Ignore a series with a percentage of empty periods greater than the specified threshold. Calculated as 100 * (number of empty periods before interpolation)/(total number of aggregation periods in Data Selection Interval) . |
For data exclusion options, see Calendar Exception Settings.
Aggregation Settings
Setting | Description |
---|---|
Group By | Grouping key for merging multiple series into one. Detailed data for multiple series sharing the same grouping key are merged into one array prior to computing aggregate statistics. |
Auto Aggregate | The server automatically calculates an aggregation period that produces the most accurate forecast, defined as having the lowest variance from observed historical data. |
Aggregation Period | Period of time over which the detailed samples are aggregated. |
Aggregate Statistic | Aggregation function applied to raw data to regularize the series. Aggregate values for empty periods without detailed data are interpolated as values of aggregate functions for previous periods. |
Algorithm Parameters
Setting | Description |
---|---|
Algorithm | Holt-Winters or ARIMA forecasting algorithms. |
Score Interval | Part of Data Selection Interval that is used to compute variance between observed values and forecast to rank forecasts by variance. The shorter the Score Interval, the more weight is assigned to recently observed values. |
Auto Period | The server automatically calculates seasonality of the underlying series that produces the most accurate forecast, defined as having the lowest variance from observed historical data. |
Period | Specify seasonality of the underlying series. |
Auto Parameters | The server automatically calculates algorithm parameters that produce the most accurate forecast, defined as having the lowest variance from observed historical data. |
Persistence Settings
Setting | Description |
---|---|
Forecast Name | Optional name used to differentiate between several forecasts for one underlying series. Use cases: forecastName field in Data APIforecast(name) Rule Engine functionforecast-name Chart setting |
Default Forecast | Enable these settings instead of default settings when calculating on-demand forecast. On-demand forecast is calculated at request time if a pre-stored forecast is not available. |
Forecast Range | Minimum and Maximum constraints applied to the stored forecast values to ensure that such values are within the specified range. Constraints are applied to the winning forecast after scoring stage. |
Forecast Interval | The length of time into the future for which forecasts are prepared and stored in the database. Can be rounded upwards to the nearest forecast period. |
Editor Tools
Editor Tools are located at the bottom of the Forecast Configuration page.
Tool | Description | Example |
---|---|---|
Calculate Parameters | Click Calculate Parameters to calculate algorithm parameters. | |
Run | Execute the forecast job. Use this tool to test a forecast. The response contains the number of calculated forecasts. | |
Export | Export forecast data in CSV format. | |
Show Meta | Display parameters used to calculate the forecast. Metadata is stored with the forecast. Collection interval is an interval within the real data extracted to build the forecast. |
Use the split-button on the Data > Forecasts page to add Exceptions and perform Testing.
Using Forecasts
Rule Engine
- Access pre-computed forecasts using forecast functions.
- Use forecast values as thresholds to trigger response actions if observed values deviate from forecast values by some amount.
- Compare forecast values to simple, weighted, and exponential moving averages.
abs(avg() - forecast().interpolated) > 25
This expression compares the moving average of some metric to the forecast and alerts if the absolute difference exceeds 25.
thresholdTime(null, 90, '1 DAY') != null
thresholdTime
function returns Unix time in milliseconds when the forecast is expected to exceed90
for the first time.- The condition becomes
true
if the expected violation is to occur in less than 1 day.
Ad hoc Export
Set Data Type setting to Forecast, optionally specify a forecast name:
Data API
Query and insert forecast values with Data API. The insert
capability can be used to populate the database with custom forecast values calculated externally.
A sample forecast JSON query:
[
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"type": "FORECAST",
"endDate": "now + 2 * hour",
"startDate": "now "
}
]
Open collapsed menu to view response.
[
{
"entity": "nurswgvml007",
"metric": "cpu_busy",
"tags": {},
"type": "FORECAST",
"aggregate": {
"type": "DETAIL"
},
"meta": {
"timestamp": "2018-05-15T08:20:00.000Z",
"averagingInterval": 600000,
"alpha": 0,
"beta": 0.4,
"gamma": 0.4,
"period": {
"count": 1,
"unit": "DAY"
},
"stdDev": 7.224603272075089
},
"data": [
{"d":"2018-05-15T08:20:00.000Z","v":11.604692968987015},
{"d":"2018-05-15T08:30:00.000Z","v":14.052095586152106},
{"d":"2018-05-15T08:40:00.000Z","v":15.715682104344845},
{"d":"2018-05-15T08:50:00.000Z","v":11.604018743609409},
{"d":"2018-05-15T09:00:00.000Z","v":12.507966355503251},
{"d":"2018-05-15T09:10:00.000Z","v":12.59619153186056},
{"d":"2018-05-15T09:20:00.000Z","v":11.092825413101579},
{"d":"2018-05-15T09:30:00.000Z","v":11.747112803805937},
{"d":"2018-05-15T09:40:00.000Z","v":11.137962830355074},
{"d":"2018-05-15T09:50:00.000Z","v":11.40358025413789},
{"d":"2018-05-15T10:00:00.000Z","v":16.728103701429056},
{"d":"2018-05-15T10:10:00.000Z","v":12.75646043607565}
]
}
]
Insert a forecast into ATSD using POST
method:
POST /api/v1/series/insert
Payload:
[
{
"entity": "nurswgvml007",
"metric": "mpstat.cpu_busy",
"type": "FORECAST",
"data": [
{
"t": 1462427358127,
"v": 52
}
]
}
]
Additional examples:
Charts
Load forecasts into charts by setting data-type = forecast
in the [series]
section.
[series]
entity = nurswgvml007
metric = cpu_busy
data-type = forecast
List of widget and series settings applicable to forecast data: