CSV

CSV files can be uploaded into Axibase Time-Series Database via HTTP API or manually through the user interface.
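
For example, a file can be submitted with any HTTP client. The Python sketch below uses the requests library; the endpoint path, the config query parameter, the file name, and the credentials are assumptions for illustration only, so check the ATSD CSV upload documentation for the exact URL and parameters.

import requests

# Minimal sketch of a CSV upload over HTTP. The URL path and the
# "config" parameter naming the CSV parser are assumptions for
# illustration; verify them against your ATSD instance.
with open("disk_usage.csv", "rb") as f:
    response = requests.post(
        "https://atsd_hostname:8443/api/v1/csv",   # assumed endpoint
        params={"config": "my-csv-parser"},        # assumed parser name parameter
        data=f,
        headers={"Content-Type": "text/csv"},
        auth=("username", "password"),
        verify=False,  # self-signed certificates are common on test instances
    )
response.raise_for_status()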

Uploaded CSV files are processed by a user-defined CSV parser, which converts the text into a tabular model and creates series, property, and message commands from cell values based on the column names.

In addition to column-based parsing, ATSD supports schema-based parsing using RFC 7111 selections:

select("#row=2-*").select("#col=3-*").
addSeries().
metric(cell(1, col)).
entity(cell(row, 2)).
tag(cell(1, 3), cell(row, 3)).
timestamp(cell(row, 1));

Produced series commands:

series e:nurswgvml007 d:2015-11-15T00:00:00Z m:space_used_%=72.2 t:disk=/dev/sda
series e:nurswgvml007 d:2015-11-15T00:00:00Z m:space_used_%=14.5 t:disk=/dev/sdb
series e:nurswgvml001 d:2015-11-15T00:00:00Z m:space_used_%=14.4 t:disk=/dev/sda1
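
As an illustration of this cell-to-command mapping, the Python sketch below reproduces the logic of the schema for one assumed input layout: column 1 holds the timestamp, column 2 the entity, column 3 a disk tag value, and the remaining columns one metric each, with the tag and metric names taken from the header row. This is only a sketch of the mapping under those assumptions, not the ATSD parser itself.

import csv

def schema_to_series(path):
    # Assumed layout: timestamp, entity, tag value, then one column per metric.
    # The header row supplies the tag name and the metric names.
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header = rows[0]
    for row in rows[1:]:                       # data rows (#row=2-*)
        timestamp, entity, tag_value = row[0], row[1], row[2]
        for col in range(3, len(row)):         # metric columns
            yield (f"series e:{entity} d:{timestamp} "
                   f"m:{header[col]}={row[col]} t:{header[2]}={tag_value}")

# Hypothetical usage:
# for command in schema_to_series("disk_usage.csv"):
#     print(command)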

Please see our CSV page on GitHub for additional technical details.

Configuration Settings

Setting - Description

Enabled - Parser status. If the parser is disabled, uploaded CSV files referencing this parser are discarded.
Name - Unique parser name used as an identifier when uploading files.
Command Type - Type of data contained in the file: time series, properties, or messages.
Write Property - Enable writing data both as series and as properties.
Entity Column - Name of the column in the CSV file containing entity names, for example host or node.
  Multiple columns can be specified in the Entity Column field; their values are concatenated into a composite entity name with a dash (-) as the separator.
  For example, given the source CSV file:
    Year,Source,Destination,Travelers
    1995,Moscow,Berlin,2000000
  and Entity Columns set to Source,Destination, the resulting entity is Moscow-Berlin.
Entity Prefix - Prefix added to entity names.
Default Entity - Entity to which all data in the file is written.
Replace Entities - Replace entity names in the input file with their aliases from the selected Replacement Table.
  For example, if the Replacement Table contains the mapping 103323213=sensor001 and an entity in the CSV file is named 103323213, it is saved in ATSD as sensor001.
Process Events - Process incoming data in the Rule Engine in addition to storing it in the database.
Metric Prefix - Prefix added to metric names.
Metric Name Column - Column containing metric names.
Metric Value Column - Column containing metric values.
Message Column - Column containing message text.
Timestamp Columns - Columns containing the timestamp of each data sample. Depending on the CSV file, the timestamp may be split across multiple columns, for example Date and Time.
  If two columns contain the timestamp, their values are concatenated with a dash (-) and matched against the Timestamp Pattern field.
  For example, given the source CSV file:
    Date,Time,Sensor,Power
    2015.05.15,09:15:00,sensor01,15
  with Timestamp Columns set to Date,Time, the combined Date-Time value is 2015.05.15-09:15:00 and the Timestamp Pattern setting is yyyy.MM.dd-HH:mm:ss.
Timestamp Type - Pattern, Seconds (Unix seconds), or Milliseconds (Unix milliseconds).
Predefined Pattern - Predefined timestamp formats.
Timestamp Pattern - Custom timestamp format, specified manually, for example dd-MMM-yy HH:mm:ss.
  If two columns contain the timestamp, separate them with a dash (-) in the pattern, as shown in the sketch following this table.
Timezone Diff Column - Column containing the time difference from UTC.
Time Zone - Time zone for interpreting timestamps.
Filter - Expression applied to each row. If the expression returns false, the row is discarded.

Filter syntax:

Fields:
timestamp - timestamp in milliseconds, computed by parsing the date from the Time Column with the specified Time Format and converting it into milliseconds.
row['columnName'] - text value of the cell in the columnName column.

Functions:
number('columnName') - returns the numeric value of the columnName cell, or NaN (Not a Number) if the cell contains unparsable text.
isNumber('columnName') - returns true if the columnName cell is a valid number.
isBlank('columnName') - returns true if the columnName cell is an empty string.
upper('columnName') - converts the columnName cell text to uppercase.
lower('columnName') - converts the columnName cell text to lowercase.
date('endtime expression') - returns time in milliseconds.

Filter examples (a rough Python equivalent of these helper functions is sketched after this table):
number('columnName') > 0
isNumber('columnName')
row['columnName'] like 'abc*'
upper('columnName') != 'INVALID'
timestamp > date('current_day')
timestamp > date('2015-08-15T00:00:00Z')
timestamp > date('now - 2 * year')

Tag Columns - Columns converted to series tags.
Default Tags - Predefined series tags, specified as name=value pairs, one per line.
Ignored Columns - List of columns ignored in METRIC and MESSAGE commands. These columns are retained in PROPERTY commands.
Renamed Columns - List of column name mappings that replace input column headers, one mapping per line. Usage: inputname=storedname
Header - Header to be used if the file contains no header or to replace the existing header.
Schema - Schema defining how cells are processed based on their position.
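
The filter expressions listed above are evaluated by the database itself. As a rough, hypothetical Python equivalent of the number(), isNumber(), and isBlank() helpers applied to a parsed row, consider the sketch below; it only illustrates the semantics described above, not the expression engine used by ATSD.

import math

def number(row, column):
    # Numeric value of the cell, or NaN if the text cannot be parsed.
    try:
        return float(row[column])
    except (KeyError, ValueError):
        return math.nan

def is_number(row, column):
    return not math.isnan(number(row, column))

def is_blank(row, column):
    return row.get(column, "") == ""

# Hypothetical row and a filter equivalent to: number('Power') > 0
row = {"Date": "2015.05.15", "Time": "09:15:00", "Sensor": "sensor01", "Power": "15"}
keep = number(row, "Power") > 0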

Columns in the CSV file that are not assigned to any field in the parser are imported as metrics.
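
The Timestamp Columns and Timestamp Pattern settings work together: the values of the listed columns are joined with a dash and the result is parsed with the pattern. The sketch below mirrors that step in Python for the Date/Time example above. ATSD patterns such as yyyy.MM.dd-HH:mm:ss use Java-style symbols, so the Python version uses the equivalent strptime directives; the UTC time zone is an assumption for illustration (the Time Zone setting controls this in practice).

from datetime import datetime, timezone

# Concatenate the Date and Time cells with a dash, as the parser does.
row = {"Date": "2015.05.15", "Time": "09:15:00", "Sensor": "sensor01", "Power": "15"}
joined = f"{row['Date']}-{row['Time']}"          # "2015.05.15-09:15:00"

# Java-style pattern yyyy.MM.dd-HH:mm:ss expressed as Python strptime directives.
# The UTC time zone is assumed here for illustration.
parsed = datetime.strptime(joined, "%Y.%m.%d-%H:%M:%S").replace(tzinfo=timezone.utc)
print(parsed.isoformat())                        # 2015-05-15T09:15:00+00:00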

Parse Settings

Setting - Description

Delimiter - Separator dividing values: comma, semicolon, or tab.
Line Delimiter - End-of-line symbol: EOL (\n, \r\n) or semicolon (;).
Text Qualifier - Escape character that allows the separator to be treated as a literal value.
Comment Symbol - Lines starting with the comment symbol, such as hash (#), are ignored.
Padding Symbol - Symbol appended to text values until all cells in a given column have the same width. Applies to fixed-length formats such as the .dat format.
Decimal Separator - Symbol separating the integral and fractional parts of a decimal number. Default value: comma. Possible values: period, comma. A conversion sketch follows this table.
Grouping Separator - Symbol used to group thousands within a number. Default value: none. Possible values: none, period, comma, space.
Fields Lengths - Width of each column, in symbols. Padding symbols are added to the end of a field to reach the specified length. Applies to files in .dat format.
Discard NaN - Discard NaN (Not a Number) values.
Ignore Line Errors - If enabled, all errors encountered while parsing a given line are ignored, including date parse errors, number parse errors, split errors, and mismatches between row and header column counts.
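
As an illustration of the Decimal Separator and Grouping Separator settings, the sketch below converts a value written with a period grouping separator and a comma decimal separator into a number. It is a minimal sketch of the conversion under those assumed settings, not the parser's implementation.

def parse_decimal(text, decimal_separator=",", grouping_separator="."):
    # Strip the grouping separator, then normalize the decimal separator
    # to a period so the value can be parsed as a float.
    if grouping_separator:
        text = text.replace(grouping_separator, "")
    return float(text.replace(decimal_separator, "."))

# "1.234,56" with grouping=period and decimal=comma becomes 1234.56
print(parse_decimal("1.234,56"))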