Google cAdvisor

Axibase Time Series Database (ATSD) collects Docker container performance metrics through Google cAdvisor (Container Advisor) for long-term retention, analytics and visualization. A single ATSD instance can collect metrics from multiple Docker hosts and cAdvisor instances.

Native Docker API Monitoring is now available.

In a basic configuration, cAdvisor monitors all running containers on the Docker host. Container statistics are sent over TCP to the ATSD container installed on the same host. When a new container is launched, cAdvisor discovers it automatically and sends its statistics to ATSD continuously while the container is running.
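
To make the TCP data flow concrete, the sketch below formats a container statistic as a text command and writes it to a TCP socket. It is only an illustration, not how cAdvisor itself is implemented: it assumes the `series` command syntax of the ATSD network API (`e:` entity, `m:` metric, `t:` tag, `ms:` time in milliseconds), and the host name `atsd.example.org`, port `8081` and the function names are hypothetical placeholders.

```python
import socket
import time


def format_series_command(entity, metric, value, tags=None, timestamp_ms=None):
    """Build one line in the assumed ATSD `series` network-command syntax."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    parts = ["series", f"e:{entity}", f"m:{metric}={value}"]
    # Tags are sorted only to make the output deterministic.
    for name, tag_value in sorted((tags or {}).items()):
        parts.append(f"t:{name}={tag_value}")
    parts.append(f"ms:{timestamp_ms}")
    return " ".join(parts)


def send_commands(commands, host="atsd.example.org", port=8081):
    """Send newline-terminated commands over one TCP connection.

    The host and port are hypothetical; use the values of your ATSD instance.
    """
    payload = "".join(command + "\n" for command in commands).encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as connection:
        connection.sendall(payload)
```

For example, `format_series_command("container-1", "cpu.usage.total", 12.5, {"host": "docker-host-1"}, 1000)` yields a single line that `send_commands` could then deliver to a reachable ATSD listener.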

In an advanced configuration, multiple cAdvisor instances can be configured to send container statistics to a centralized ATSD installation. ATSD will store metrics from local and remote Docker hosts for consolidated reporting and analytics. This type of configuration is suited for centralized workload planning, capacity planning and performance monitoring.

[Figure: distributed Docker infrastructure]

Launch ATSD container as the back-end for cAdvisor

Installation steps are described here.

In-container Monitoring

In addition to container statistics received from Docker hosts, we recommend enabling collectd, tcollector or nmon daemons on key containers for an in-depth view into their system performance. All of these collection daemons and their protocols are supported by ATSD. We also recommend installing collectd, tcollector or nmon on the Docker hosts themselves.

Built-in entity groups and portals

Default visualization portals for cAdvisor entities are included in ATSD.

Default cAdvisor portal names:

  • Cadvisor Overview
  • Cadvisor Disk Detail
  • Cadvisor Host
  • Cadvisor Multi-Host

Using the built-in Overview, Disk Detail, Host and Multi-Host visualization portals, you can quickly identify bottlenecks in your microservices infrastructure.

Entities collecting cAdvisor data are automatically grouped in ATSD.

cAdvisor entity group names:

  • cadvisor-container
  • cadvisor-hosts

cAdvisor Overview Portal

Launch live cAdvisor Overview Portal in Axibase Chart Lab.

cAdvisor Disk Detail Portal

Launch live cAdvisor Disk Detail Portal in Axibase Chart Lab.

cAdvisor Host Portal

Launch live cAdvisor Host Portal in Axibase Chart Lab.

cAdvisor Multi-Host Portal

Launch live cAdvisor Multi-Host Portal in Axibase Chart Lab.

Collected cAdvisor Metrics

CPU Metrics

CPU metrics are found in the cpuacct controller. CPU usage is generated by the processes of the container and is broken down into user and system time. User time is the time during which the processes were in direct control of the CPU, and system time is the time during which the CPU was executing system calls on behalf of those processes. Both times are expressed in ticks of 1/100th of a second.
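
As an illustration of the tick unit, the sketch below converts two cumulative user/system tick readings into CPU seconds and into a percentage of one CPU core over the sampling interval. It assumes 100 ticks per second, as described above; the function names and sample dictionaries are hypothetical, and this is not necessarily how the `%` metrics listed below are calculated.

```python
USER_HZ = 100  # ticks per second, matching the 1/100th-of-a-second unit


def ticks_to_seconds(ticks):
    """Convert a tick count into seconds of CPU time."""
    return ticks / USER_HZ


def cpu_percent(previous, current, interval_seconds):
    """Return CPU usage as a percentage of one core between two samples.

    `previous` and `current` are dicts holding cumulative 'user' and
    'system' tick counters taken `interval_seconds` apart.
    """
    user = ticks_to_seconds(current["user"] - previous["user"])
    system = ticks_to_seconds(current["system"] - previous["system"])
    return {
        "user%": 100.0 * user / interval_seconds,
        "system%": 100.0 * system / interval_seconds,
        "total%": 100.0 * (user + system) / interval_seconds,
    }
```

With readings of 1000/500 and then 1150/550 user/system ticks taken 10 seconds apart, the container used 1.5 seconds of user time and 0.5 seconds of system time, or 20% of one core in total.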

cpu.loadaverage
cpu.loadaverage%
cpu.usage.percpu
cpu.usage.percpu%
cpu.usage.system
cpu.usage.system%
cpu.usage.total
cpu.usage.total%
cpu.usage.user
cpu.usage.user%
cpu.host.usage.system%
cpu.host.usage.total%
cpu.host.usage.user%

I/O Metrics

I/O is accounted in the blkio controller.

io_service_bytes – indicates the number of bytes read and written by the cgroup. It has 4 counters per device, because for each device, it differentiates between synchronous vs. asynchronous I/O, and reads vs. writes.

io_serviced – the number of I/O operations performed, regardless of their size. It also has 4 counters per device.
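
The four counters per device can be seen in the text format that the cgroup v1 blkio controller uses for files such as `blkio.throttle.io_service_bytes`. The sketch below parses that format into a per-device dictionary. It assumes lines of the form `major:minor Operation value`; the function name `parse_blkio` is hypothetical.

```python
from collections import defaultdict


def parse_blkio(text):
    """Parse blkio counter text into {device: {operation: value}}.

    Expected per-device lines look like '8:0 Read 4096'. Lines with a
    different shape, such as the trailing grand total 'Total N', are skipped.
    """
    devices = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        device, operation, value = fields
        devices[device][operation.lower()] = int(value)
    return dict(devices)
```

For each device, the read and write counters and the sync and async counters are two different breakdowns of the same per-device total, which is why reads plus writes is expected to equal sync plus async.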

diskio.ioservicebytes.async
diskio.ioservicebytes.read
diskio.ioservicebytes.sync
diskio.ioservicebytes.total
diskio.ioservicebytes.write
diskio.ioserviced.async
diskio.ioserviced.read
diskio.ioserviced.sync
diskio.ioserviced.total
diskio.ioserviced.write

Memory Metrics

Memory metrics are found in the “memory” cgroup.

pgfault and pgmajfault – indicate the number of times that a process of the cgroup triggered a page fault and a major fault. A page fault happens when a process accesses a part of its virtual memory space which is nonexistent or protected.

memory.usage – the amount of all used memory, regardless of when it was accessed.

memory.workingset – the amount of memory that processes require in a given time interval. This includes recently accessed memory, dirty memory and kernel memory.
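
The relationship between total usage and the working set can be sketched as follows. The example assumes that the working set is estimated by subtracting inactive file-backed memory (page cache that has not been accessed recently) from total usage, with a floor of zero, which is our understanding of how cAdvisor derives it; the function and parameter names are hypothetical.

```python
def working_set_bytes(usage_bytes, inactive_file_bytes):
    """Estimate the working set as usage minus inactive file cache.

    Assumption: inactive file-backed pages can be reclaimed cheaply, so
    they are excluded. The result never drops below zero.
    """
    if inactive_file_bytes >= usage_bytes:
        return 0
    return usage_bytes - inactive_file_bytes
```

A container with 512 MiB of usage and 128 MiB of inactive file cache would therefore report a working set of 384 MiB under this assumption.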

memory.containerdata.pgfault
memory.containerdata.pgmajfault
memory.hierarchicaldata.pgfault
memory.hierarchicaldata.pgmajfault
memory.usage
memory.workingset
cadvisor.memory.cache
cadvisor.memory.rss

Network Metrics

Network metrics track the number of packets received and sent, the amount of traffic in bytes, dropped packets and errors.

network.rxbytes – cumulative count of bytes received.
network.rxpackets – cumulative count of packets received.
network.rxerrors – cumulative count of receive errors encountered.
network.rxdropped – cumulative count of packets dropped while receiving.
network.txbytes – cumulative count of bytes transmitted.
network.txpackets – cumulative count of packets transmitted.
network.txerrors – cumulative count of transmit errors encountered.
network.txdropped – cumulative count of packets dropped while transmitting.
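
Because these counters are cumulative, a throughput figure has to be derived from the difference between two samples. The sketch below shows one way to do that, treating a decrease in the counter as a reset (for example, after a container restart). The function name and the choice to return `None` on a reset are assumptions made for illustration.

```python
def counter_rate(previous, current, interval_seconds):
    """Return the per-second rate between two cumulative counter samples.

    Returns None when the counter went backwards, because the amount
    counted during that interval cannot be recovered reliably.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = current - previous
    if delta < 0:
        return None  # counter was reset between the two samples
    return delta / interval_seconds
```

For example, network.rxbytes samples of 1000 and 4000 taken 60 seconds apart correspond to 50 bytes per second.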

network.rxbytes
network.rxdropped
network.rxerrors
network.rxpackets
network.txbytes
network.txdropped
network.txerrors
network.txpackets

Task Metrics

taskstats.nriowait – number of tasks waiting on I/O.
taskstats.nrrunning – number of running tasks.
taskstats.nrsleeping – number of sleeping tasks.
taskstats.nrstopped – number of tasks in stopped state.
taskstats.nruninterruptible – number of tasks in uninterruptible state.

taskstats.nriowait
taskstats.nrrunning
taskstats.nrsleeping
taskstats.nrstopped
taskstats.nruninterruptible

File System Metrics

File system metrics track capacity, usage, and read and write activity for attached file systems.

filesystem.available
filesystem.ioinprogress
filesystem.iotime
filesystem.limit
filesystem.readscompleted
filesystem.readsmerged
filesystem.readtime
filesystem.sectorsread
filesystem.sectorswritten
filesystem.usage
filesystem.weightediotime
filesystem.writescompleted
filesystem.writesmerged
filesystem.writetime
cadvisor.filesystem.baseusage
cadvisor.filesystem.inodesfree

NOTE: disk metrics and file system metrics are only collected from containers that have attached volumes.