Generate Webhook when Docker Hub Build Fails

Overview

If you have images hosted on the Docker Hub registry, you need to monitor automated build jobs on Docker Hub to ensure that the images you successfully publish new images and your CI pipeline remains healthy.

While the Docker Hub provides the capability to trigger outgoing webhooks, the webhooks only execute when the automated build completes successfully. If the job fails or becomes stuck at Queued status, Docker Hub does not fire webhooks and your team remains unaware of broken builds. This limitation is known but a fix is not yet available.

While the email option serves as a workaround for build failures, do not rely on it for programmable integration with alerting and CI systems. Also, Docker Hub does not dispatch an email alert when the build remains queued for a long period of time.

This guide describes a solution, based on the Rule Engine implemented in Axibase Time Series Database (ATSD), which polls the Docker Hub build history using the Docker Hub v2 API and generates missing webhooks in case of build failures or if the build is queued for more than one hour (you can configure this threshold).

Note that this solution only applies to automated builds which are executed by Docker Hub itself.

Build Failures

Build failures come in different flavors - some are caused by human error while others occur due to infrastructure changes outside of human control. Your team needs to fix these failures regardless of the cause.

Human Error:

Infrastructure Error:

Build History

Access build history, containing success and failure statuses, under the Build Details tab in Docker Hub. History is only available for automated builds. Images that are uploaded by external tools are out of scope.

The history is also available via Docker Hub v2 API.

The build fails if the status is not 0 (Queued), 2 or 3 (Building), or 10 (Success).

In the example below, the status is -1, reported as ! Error by Docker Hub.

{
  "id": 23028946,
  "status": -1,
  "created_date": "2018-04-24T13:34:33.654759Z",
  "last_updated": "2018-04-24T13:36:39.537606Z",
  "build_code": "beedjaiuvjxk5m37uxjorja",
  "dockertag_name": "latest",
  "cause": "TRIGGERED_VIA_API"
}

Webhook Solution

The proposed solution retrieves the most recent record from the Docker Hub build history for all projects in the specified namespace and sends an HTTP notification to the consuming web service that you specify if the build status is not 2, 3 or 10, as well as when the status remains at 0 (Queued) for more than one hour.

Webhook Payload on-success

A sample request payload generated by the Docker Hub webhook when the build succeeds is shown below. These notifications continue to be generated by Docker Hub.

{
  "push_data": {
    "pushed_at": 1524572027,
    "images": [],
    "tag": "latest",
    "pusher": "axibase"
  },
  "callback_url": "https://registry.hub.docker.com/u/axibase/atsd/hook/123/",
  "repository": {
    "status": "Active",
    "description": "Axibase Time Series Database",
    "is_trusted": true,
    "full_description": "Axibase Time Series Database",
    "repo_url": "https://hub.docker.com/r/axibase/atsd",
    "owner": "axibase",
    "is_official": false,
    "is_private": false,
    "name": "atsd",
    "namespace": "axibase",
    "star_count": 5,
    "comment_count": 0,
    "date_created": 1433511877,
    "dockerfile": "REDACTED for brevity",
    "repo_name": "axibase/atsd"
  }
}

Webhook Payload on-error

A sample notification produced by an ATSD webhook when a new build failure is detected is shown below.

As you can see, the synthetic on-error payload is similar to the native on-success webhook above except the repository.status field is set to Failed (in case of build failure) or Queued (in case of hanging build). An extra build_history object is injected for additional detail.

{
  "push_data": {
    "pushed_at": 1524632779,
    "images": [],
    "tag": "latest",
    "pusher": "axibase"
  },
  "repository": {
    "status": "Failed",
    "repo_url": "https://hub.docker.com/r/axibase/atsd",
    "owner": "axibase",
    "name": "atsd",
    "namespace": "axibase",
    "repo_name": "axibase/atsd"
  },
  "build_history": {
    "id" : 23028946,
    "status" : -1,
    "created_date" : "2018-04-24T13:34:33.654759Z",
    "last_updated" : "2018-04-24T13:36:39.537606Z",
    "build_code" : "beedjaiuvjxk5m37uxjorja",
    "dockertag_name" : "latest",
    "cause" : "TRIGGERED_VIA_API"
  }
}

Launch Instructions

Set NAMESPACE variable in the command below to your Docker Hub namespace, for example:

--env NAMESPACE='google' \

Set NOTIFY_URL variable to a request URL where on-error webhook notifications may be sent, for example:

--env NOTIFY_URL='https://host01:10443/jenkins/plugin?token=123' \

The notification URL may include Basic authorization credentials, for example https://username:password@atsd_hostname:10443/. SSL certificate validation is disabled by default.

Execute the command below to launch an ATSD Sandbox container.

docker run -d -p 8443:8443 -p 9443:9443 \
  --name=atsd-sandbox \
  --env NAMESPACE='google' \
  --env NOTIFY_URL='https://webhook.site/71fd9feb-8751-4afd-9e13-16072a34b259' \
  --env ATSD_IMPORT_PATH='https://raw.githubusercontent.com/axibase/atsd-use-cases/master/integrations/docker/resources/notify.xml,https://raw.githubusercontent.com/axibase/atsd-use-cases/master/integrations/docker/resources/rule.xml' \
  --env COLLECTOR_IMPORT_PATH='https://raw.githubusercontent.com/axibase/atsd-use-cases/master/integrations/docker/resources/job.xml' \
  axibase/atsd-sandbox:latest

Watch the container start logs for All applications started.

docker logs -f atsd-sandbox

The sandbox runs two front-facing applications which may be accessed using default credentials:

  • ATSD at https://docker_host:8443
  • Axibase Collector at https://docker_host:9443

Verification

Verify that the webhook delivery is working as expected.

On Docker Hub, open the Build Settings for one of the projects (images). Trigger a build for one of the branches known to fail.

The webhook arrives in less than 5 minutes, which is the collector polling interval.

If you do not have a good failure candidate handy, send a test message command for test/my-image project as described below.

Adjust the frequency in Collector web interface at https://docker_host:8443. Open dockerhub-poller job and set the cron Expression to 0 * * * * ? to run the data collection every minute.

Automation

Retry Failed Builds

In addition to sending build error notifications, you can program the dockerhub-build-fail rule in ATSD to retry a failed build using Docker Hub Remote Build Triggers. The remote triggers allow you to initiate an automated build by sending a POST request to an endpoint on Docker Hub.

  1. Create a new trigger token for a project on Docker Hub (projects with intermittent failures are good candidates for this).

  2. Create a CUSTOM web notification to launch an automated build on Docker Hub.

  3. Add a corresponding web notification action in the dockerhub-build-fail rule.

For a more advanced implementation, create a lookup table to associate images in incoming failure events with trigger tokens.

Send Alerts

You can also customize the rule to send alerts into your preferred messaging service such as Slack or Rocket.Chat.

Troubleshooting

Axibase Collector

Log in to Collector at https://docker_host:9443 with default credentials.

Locate the dockerhub-poller job. Check that the status is Completed.

Axibase Time Series Database

Log in to ATSD at https://docker_host:8443 with default credentials.

Open the Alerts > Web Notifications page. Open the dockerhub-webhook-sender notification. Click Test to verify connection.

Open the Data > Data Entry page in the main menu. Submit this command to emulate a build failure detected by Axibase Collector.

message e:docker.hub t:build_code=abc t:last_updated=2019-01-01T00:00:00Z t:dockertag_name=latest t:name=my-image t:cause=TRIGGERED_VIA_API t:id=11111111 t:created_date=2019-01-01T00:00:00Z t:source=docker.hub t:repository=test/my-image t:type=build t:user=test t:status=-1

Check that the dockerhub-webhook-sender status is OK.

The target service now receives the JSON payload:

{
  "build_history": {
    "id": 23028946,
    "status": -1,
    "created_date": "2018-04-24T13:34:33.654759Z",
    "last_updated": "2018-04-24T13:36:39.537606Z",
    "build_code": "abc",
    "dockertag_name": "latest",
    "cause": "TRIGGERED_VIA_API"
  },
  "push_data": {
    "pushed_at": 1524741369,
    "images": [],
    "tag": "latest",
    "pusher": "test"
  },
  "repository": {
    "status": "Failed",
    "repo_url": "https://hub.docker.com/r/test/my-image",
    "owner": "test",
    "name": "my-image",
    "namespace": "test",
    "repo_name": "test/my-image"
  }
}

References