Wrangling Federal Reserve Economic Data with SQL and Declarative Graphics
The Federal Reserve Economic Research Division(FRED) of the St. Louis Federal Reserve publishes publicly available data on a range of topics related to macroeconomic trends such as GDP, employment and other national statistics.
This article focuses on the
AD01RC1Q027SBEA series which tracks net lending and borrowing of the United States Government.
Handling Special Items
AD01RC1Q027SBEA series is annualized and each quarterly value is therefore multiplied by
4 to arrive at an annual estimate. This calculation is used to show the annual total if the trends of a specific quarter are replicated over the course of the year. During the final quarter of 2017, a potential
$250 billion windfall from one-time corporate repatriation taxes is added to the total by virtue of annualized calculation. As a result, the
$250 billion extra quarterly income is translated into
$1 trillion after transformation. The original FRED blog post discussing this data includes a visualization which considers this annualized value:
As a result of this transformation, the annual budget of the U.S. government is estimated to be a positive
$14.68 million, an accomplishment last reported in 2001.
This article illustrates how SQL and the declarative graphics library in ATSD can be utilized to perform ad-hoc data transformations such as the removal of special items.
The original data and new data are shown together. The range of conclusions one can draw from these three series are vastly different.
The special item is removed using a
# remove extraordinary item completely. replace-value = time == new Date('2017-10-01T00:00:00Z').getTime() ? value-1000 : value
This setting targets a specific date, and evaluates an
if-else expression which subtracts
$1 trillion or
$750 billion from the value of the defined date.
group-statistic settings are used to return non-annualized data, which still includes the
$250 billion extraordinary item, but does not include the additional
$750 billion from annualization.
# remove annualization factor from raw data, group by annual period, sum samples for actual results. replace-value = value/4 group-period = 1 year group-statistic = sum
The data is presented via the Trends service.
Querying FRED Data with SQL
Similar to the above visualizations, data cleanup may be performed with SQL.
By de-annualizing quarterly values and aggregating them back into annual totals using date aggregation in ATSD SQL, the effect of the phantom
$750B is removed from the annual series.
SELECT date_format(time, 'yyyy') "Year", SUM(value/4) "Net Lending/Borrowing" FROM "ad01rc1q027sbea" GROUP BY period(1 year) ORDER BY datetime DESC
The ten most recent years of federal government lending / borrowing:
As a result, the estimated annual budget balance is now a deficit of
$685 billion, a number that is materially different from the estimated surplus of
The above query may be executed in the ATSD web console or using the ATSD API Client for Python, where the data may be queried using SQL and converted to
pandas dataframes for further analysis.
>>> q = """ ... SELECT date_format(time, 'yyyy') "Year", SUM(value/4) "Net Lending/Borrowing" ... FROM "ad01rc1q027sbea" ... GROUP BY period(1 year) ... ORDER BY datetime DESC ... """
View the complete result set here:
To show only years where the United States had an annual lending surplus, execute this query with the
SELECT date_format(time, 'yyyy') "year", SUM(value)/4 "surplus" FROM "ad01rc1q027sbea" WHERE datetime >= '1970' GROUP BY period(1 year) HAVING SUM(value) > 0
The result set shows only one year since 1970 when the United States achieved a net lending surplus:
Although in the FRED visualization it appeared that the United States government has finally achieved a budget surplus, in fact the nature of the data is such that it only seems that way. The dataset here is annualized, meaning that quarterly data is plotted as if the trend remains constant for the entire year. Thus, the
$250 billion tax relief created by the current administration is considered as
$1 trillion due to annualization calculations.
The special item can be completely removed from the series using the
SELECT date_format(time, 'yyyy') "Year", CASE WHEN datetime = '2017-01-01' THEN SUM(value/4) - 250 -- alternatively THEN (SUM(value)-1000)/4 ELSE SUM(value/4) END AS "Annual Lending / Borrowing" FROM "ad01rc1q027sbea" GROUP BY period(1 year) ORDER BY datetime DESC
The result set from 2007 onward:
|Year||Annual Lending / Borrowing|
The dataset used for this article is stored in the Trends instance of ATSD.
If you would like read-only credentials to the database to recreate the queries shown here, test drive the ATSD API Client for Python, or query any of the other datasets stored there, reach out to us, we're happy to provide them.