coredns

Overview

The CoreDNS mixin provides a Grafana dashboard and Prometheus alerts for monitoring CoreDNS. It was introduced in the Kubernetes Node Local DNS Cache blog post to help users monitor CoreDNS in Kubernetes. The mixin can also be used to monitor a standalone CoreDNS instance without an orchestrator.

The Jsonnet source code is available at github.com/povilasv/coredns-mixin

Alerts

The complete list of pregenerated alerts is available here.

coredns

CoreDNSDown

https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsdown


alert: CoreDNSDown
annotations:
  message: CoreDNS has disappeared from Prometheus target discovery.
  runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsdown
expr: |
  absent(up{job="kube-dns"} == 1)
for: 15m
labels:
  severity: critical
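
The expression above selects on job="kube-dns", so a standalone CoreDNS instance has to be scraped under that job name before CoreDNSDown can evaluate against it. The fragment below is a minimal sketch of a matching Prometheus scrape configuration. It assumes the CoreDNS prometheus plugin is enabled on its default port 9153, and the target hostname is a hypothetical placeholder.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  # The job name must match the job="kube-dns" selector used in the alert expression.
  - job_name: kube-dns
    static_configs:
      # Hypothetical address; 9153 is the default port of the CoreDNS prometheus plugin.
      - targets: ['coredns.example.internal:9153']
```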

CoreDNSLatencyHigh

https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednslatencyhigh


alert: CoreDNSLatencyHigh
annotations:
  message: CoreDNS has 99th percentile latency of {{ $value }} seconds for server
    {{ $labels.server }} zone {{ $labels.zone }}.
  runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednslatencyhigh
expr: |
  histogram_quantile(0.99, sum(rate(coredns_dns_request_duration_seconds_bucket{job="kube-dns"}[5m])) by(server, zone, le)) > 4
for: 10m
labels:
  severity: critical

CoreDNSErrorsHigh

https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednserrorshigh


alert: CoreDNSErrorsHigh
annotations:
  message: CoreDNS is returning SERVFAIL for {{ $value | humanizePercentage }} of
    requests.
  runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednserrorshigh
expr: |
  sum(rate(coredns_dns_response_rcode_count_total{job="kube-dns",rcode="SERVFAIL"}[5m]))
    /
  sum(rate(coredns_dns_response_rcode_count_total{job="kube-dns"}[5m])) > 0.03
for: 10m
labels:
  severity: critical

CoreDNSErrorsHigh

https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednserrorshigh


alert: CoreDNSErrorsHigh
annotations:
  message: CoreDNS is returning SERVFAIL for {{ $value | humanizePercentage }} of
    requests.
  runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednserrorshigh
expr: |
  sum(rate(coredns_dns_response_rcode_count_total{job="kube-dns",rcode="SERVFAIL"}[5m]))
    /
  sum(rate(coredns_dns_response_rcode_count_total{job="kube-dns"}[5m])) > 0.01
for: 10m
labels:
  severity: warning
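
The two CoreDNSErrorsHigh rules share an alert name and differ only in threshold and severity, so once the SERVFAIL ratio stays above 3% both of them fire. One possible way to avoid duplicate notifications is an Alertmanager inhibition rule. The fragment below is a sketch and not part of the mixin. It assumes alerts are routed through Alertmanager 0.22 or later, which supports the matchers syntax used here. As written, it applies to any pair of alerts that share a name, not only to CoreDNSErrorsHigh.

```yaml
# alertmanager.yml (fragment)
inhibit_rules:
  # Mute the warning while a critical alert with the same alert name is firing.
  - source_matchers:
      - severity = "critical"
    target_matchers:
      - severity = "warning"
    equal: ['alertname']
```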

coredns_forward

CoreDNSForwardLatencyHigh

https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsforwardlatencyhigh


alert: CoreDNSForwardLatencyHigh
annotations:
  message: CoreDNS has 99th percentile latency of {{ $value }} seconds forwarding
    requests to {{ $labels.to }}.
  runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsforwardlatencyhigh
expr: |
  histogram_quantile(0.99, sum(rate(coredns_forward_request_duration_seconds_bucket{job="kube-dns"}[5m])) by(to, le)) > 4
for: 10m
labels:
  severity: critical

CoreDNSForwardErrorsHigh

https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsforwarderrorshigh


alert: CoreDNSForwardErrorsHigh
annotations:
  message: CoreDNS is returning SERVFAIL for {{ $value | humanizePercentage }} of
    forward requests to {{ $labels.to }}.
  runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsforwarderrorshigh
expr: |
  sum(rate(coredns_forward_response_rcode_count_total{job="kube-dns",rcode="SERVFAIL"}[5m]))
    /
  sum(rate(coredns_forward_response_rcode_count_total{job="kube-dns"}[5m])) > 0.03
for: 10m
labels:
  severity: critical

CoreDNSForwardErrorsHigh

https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsforwarderrorshigh


alert: CoreDNSForwardErrorsHigh
annotations:
  message: CoreDNS is returning SERVFAIL for {{ $value | humanizePercentage }} of
    forward requests to {{ $labels.to }}.
  runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsforwarderrorshigh
expr: |
  sum(rate(coredns_forward_response_rcode_count_total{job="kube-dns",rcode="SERVFAIL"}[5m]))
    /
  sum(rate(coredns_forward_response_rcode_count_total{job="kube-dns"}[5m])) > 0.01
for: 10m
labels:
  severity: warning
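
The rules above are listed one at a time, but Prometheus loads rules only from files that wrap them in rule groups. As an illustration, the fragment below places CoreDNSForwardLatencyHigh in a group named after the coredns_forward heading. The file name is a hypothetical choice, and the file would still need to be listed under rule_files in prometheus.yml.

```yaml
# coredns-forward-rules.yaml (hypothetical file name)
groups:
  - name: coredns_forward
    rules:
      # Rule body copied unchanged from the listing above.
      - alert: CoreDNSForwardLatencyHigh
        annotations:
          message: CoreDNS has 99th percentile latency of {{ $value }} seconds forwarding
            requests to {{ $labels.to }}.
          runbook_url: https://github.com/povilasv/coredns-mixin/tree/master/runbook.md#alert-name-corednsforwardlatencyhigh
        expr: |
          histogram_quantile(0.99, sum(rate(coredns_forward_request_duration_seconds_bucket{job="kube-dns"}[5m])) by(to, le)) > 4
        for: 10m
        labels:
          severity: critical
```

Running promtool check rules against such a file is a quick way to validate its syntax before Prometheus loads it.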

Dashboards

The following dashboards are generated from the mixin and hosted on GitHub: