Kubernetes cluster state monitoring with Netdata

Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management.

This module collects health metrics for Kubernetes nodes, pods, and pod containers.

Requirements

  • Only works when Netdata is running inside a Kubernetes cluster.
  • RBAC: requires the list and watch verbs on pod and node resources.
  • RBAC: requires the get verb on the namespace resource.
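
The RBAC requirements above can be checked from a workstation with `kubectl auth can-i`. The following is a minimal sketch, not part of the collector: the function name `check_k8s_state_rbac`, the `default` namespace, and the `netdata` service account name are assumptions, so adjust them to match your installation.

```shell
# Assumed names (not taken from the Netdata docs): adjust to your installation.
NAMESPACE="${NAMESPACE:-default}"              # namespace Netdata runs in (assumption)
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-netdata}"  # service account used by Netdata (assumption)
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: for each permission in the Requirements list,
# print the permission followed by kubectl's answer ("yes" or "no").
check_k8s_state_rbac() {
  subject="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
  for perm in "list pods" "watch pods" "list nodes" "watch nodes" "get namespaces"; do
    printf '%s: ' "$perm"
    # $perm is left unquoted on purpose so it splits into VERB and RESOURCE.
    # kubectl exits non-zero on "no"; "|| true" keeps the loop going.
    $KUBECTL auth can-i $perm --as="$subject" || true
  done
}
```

After loading the snippet into a shell, running `check_k8s_state_rbac` prints one line per permission. The `--as` flag impersonates the service account, so the user running the check needs impersonation rights. Pod checks are evaluated in the current namespace unless `--all-namespaces` is added.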

Metrics

All metrics have the "k8s_state." prefix. For example, node_age is exposed as k8s_state.node_age.

Node

| Metric | Dimensions | Units |
|---|---|---|
| node_allocatable_cpu_requests_utilization | requests | % |
| node_allocatable_cpu_requests_used | requests | millicpu |
| node_allocatable_cpu_limits_utilization | limits | % |
| node_allocatable_cpu_limits_used | limits | millicpu |
| node_allocatable_mem_requests_utilization | requests | % |
| node_allocatable_mem_requests_used | requests | bytes |
| node_allocatable_mem_limits_utilization | limits | % |
| node_allocatable_mem_limits_used | limits | bytes |
| node_allocatable_pods_utilization | allocated | % |
| node_allocatable_pods_usage | available, allocated | pods |
| node_condition | added dynamically | status |
| node_schedulability | schedulable, unschedulable | state |
| node_pods_readiness | ready | % |
| node_pods_readiness_state | ready, unready | pods |
| node_pods_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | pods |
| node_pods_phase | running, failed, succeeded, pending | pods |
| node_containers | containers, init_containers | containers |
| node_containers_state | running, waiting, terminated | containers |
| node_init_containers_state | running, waiting, terminated | containers |
| node_age | age | seconds |

Pod

| Metric | Dimensions | Units |
|---|---|---|
| pod_cpu_requests_used | requests | millicpu |
| pod_cpu_limits_used | limits | millicpu |
| pod_mem_requests_used | requests | bytes |
| pod_mem_limits_used | limits | bytes |
| pod_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | state |
| pod_phase | running, failed, succeeded, pending | state |
| pod_age | age | seconds |
| pod_containers | containers, init_containers | containers |
| pod_containers_state | running, waiting, terminated | containers |
| pod_init_containers_state | running, waiting, terminated | containers |

Pod container

| Metric | Dimensions | Units |
|---|---|---|
| pod_container_readiness_state | ready | state |
| pod_container_restarts | restarts | restarts/s |
| pod_container_state | running, waiting, terminated | state |
| pod_container_waiting_state_reason | added dynamically | state |
| pod_container_terminated_state_reason | added dynamically | state |

Labels

  • The 'k8s_cluster_id' value is the UID of the 'kube-system' namespace.
  • 'k8s_cluster_name' currently only appears when running on GKE.

| Label | Node | Pod | Container |
|---|---|---|---|
| k8s_kind | yes | yes | yes |
| k8s_cluster_id | yes | yes | yes |
| k8s_cluster_name | yes | yes | yes |
| k8s_node_name | yes | yes | yes |
| k8s_namespace | | yes | yes |
| k8s_controller_kind | | yes | yes |
| k8s_controller_name | | yes | yes |
| k8s_pod_uid | | yes | yes |
| k8s_pod_name | | yes | yes |
| k8s_qos_class | | yes | yes |
| k8s_container_id | | | yes |
| k8s_container_name | | | yes |
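
Since 'k8s_cluster_id' is the UID of the 'kube-system' namespace, the expected label value can be looked up with kubectl and compared against what Netdata reports. This is a minimal sketch: the helper name `k8s_cluster_id` is hypothetical, and it assumes kubectl is configured for the cluster in question.

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: print the UID of the kube-system namespace,
# which is the value used for the k8s_cluster_id label.
k8s_cluster_id() {
  $KUBECTL get namespace kube-system -o jsonpath='{.metadata.uid}'
}
```

Running `k8s_cluster_id` after loading the snippet prints the UID on a single line.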

Configuration

No configuration is needed. This module is enabled when you install Netdata using netdata/helmchart.

Troubleshooting

To troubleshoot issues with the k8s_state collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/

  • Switch to the netdata user.

    sudo -u netdata -s

  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m k8s_state
