Time Series Socrata Datasets

Overview

The list contains U.S. Government open data datasets with a datetime dimension.

The total number of entries is 7,190 out of 280,000 available in the catalog and includes only datasets published in the Socrata format.

The listing is grouped by reporting agency (host), category, and catalog tag:

Analyzing Data

To analyze a dataset with SQL queries you can automatically load it into Axibase Time Series Database (ATSD), a non-relational database optimized for storing and analyzing time-series data.

Install Docker.

Download the docker-compose.yml file.

curl -o docker-compose.yml \
  https://raw.githubusercontent.com/axibase/open-data-catalog/master/resources/docker-compose.yml

Launch ATSD and Axibase Collector containers and specify the URL to the dataset in the DATASET_URL variable.

For this walk-through, the dataset is Seattle City Budget and the URL is provided in the Data: JSON field.

https://data.seattle.gov/api/views/55z8-f4gi/rows.json

Remove the ?max_rows=100 parameter from the URL to load the entire dataset.

export DATASET_URL=https://data.seattle.gov/api/views/55z8-f4gi/rows.json; \
docker-compose up -d

Watch for ATSD start completed message in the start log.

docker logs -f atsd
[ATSD] ATSD user interface:
[ATSD] http://172.18.0.2:8088
[ATSD] https://172.18.0.2:8443
[ATSD] ATSD start completed. Time: 2018-06-25 11-03-12.
[ATSD] Collector account 'myuser' created. Type: 'api-rw'.

Login into ATSD web interface on port 8443 (https).

Open Entities tab in the main menu, locate the 55z8-f4gi entity which refers to dataset identifier in the DATASET_URL variable and click Metrics to view a list of metrics collected in this dataset.

Choose one of the available metrics and click Series icon to open a list of all series collected for this entity and metric.

Select a series and open the Series Statistics page. The page contains summary information about the series including metadata published by the reporting agency.

Click SQL to open the SQL Console with a sample pre-generated query that selects last 100 values for the given series.

Customize the query to produce an analytical report, for example:

SELECT tags."department", SUM(value)/1000000 AS "Total, $M"
  FROM "proposed_2011_expenditure_allowance"
 WHERE entity = '55z8-f4gi'
GROUP BY tags."department"
  ORDER BY SUM(value) DESC

Review the results which can be also exported in CSV and Excel formats.

| tags.department                       | Total, $M |
|---------------------------------------|-----------|
| Seattle City Light                    | 1087.5    |
| Seattle Public Utilities              | 823.9     |
| Finance General                       | 358.9     |
| Seattle Department of Transportation  | 322.0     |
| Seattle Police Department             | 248.5     |

Review SQL syntax and SQL examples in ATSD documentation for insights and ideas on how to analyze the data.