Kubernetes cluster state monitoring with Netdata

Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management.

This module collects health metrics for the following Kubernetes resources: nodes, pods, and pod containers.

Requirements

  • Only works when Netdata is running inside a Kubernetes cluster.
  • RBAC: requires the list and watch verbs on pod and node resources.
  • RBAC: requires the get verb on the namespace resource.
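
One way to confirm these permissions is to ask the API server with kubectl auth can-i while impersonating the service account that Netdata runs as. The sketch below is an illustration only and is not part of the collector: the function name check_k8s_state_rbac and its two arguments (the service account's namespace and name) are hypothetical, and impersonation with --as assumes your own user is allowed to impersonate service accounts.

```shell
# Check the RBAC permissions listed above for one service account.
# Usage: check_k8s_state_rbac <namespace> <service-account>
# Both arguments are placeholders; use the values from your own deployment.
# The body runs in a subshell so its variables do not leak into the caller.
check_k8s_state_rbac() (
  subject="system:serviceaccount:$1:$2"
  rc=0
  # Each entry is "<verb> <resource>", matching the requirements list.
  for check in "list pods" "watch pods" "list nodes" "watch nodes" "get namespaces"; do
    verb="${check% *}"
    resource="${check#* }"
    # kubectl auth can-i prints "yes" or "no" on stdout.
    answer=$(kubectl auth can-i "$verb" "$resource" \
      --all-namespaces --as="$subject" 2>/dev/null) || true
    if [ "$answer" = "yes" ]; then
      echo "ok: $verb $resource"
    else
      echo "missing: $verb $resource"
      rc=1
    fi
  done
  # Exit status is 0 only if every permission was granted.
  [ "$rc" -eq 0 ]
)
```

For example, check_k8s_state_rbac netdata netdata would check a service account named netdata in the netdata namespace; both names are placeholders.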

Metrics

All metric names have the "k8s_state." prefix.

Node

Metric | Dimensions | Units
node_allocatable_cpu_requests_utilization | requests | %
node_allocatable_cpu_requests_used | requests | millicpu
node_allocatable_cpu_limits_utilization | limits | %
node_allocatable_cpu_limits_used | limits | millicpu
node_allocatable_mem_requests_utilization | requests | %
node_allocatable_mem_requests_used | requests | bytes
node_allocatable_mem_limits_utilization | limits | %
node_allocatable_mem_limits_used | limits | bytes
node_allocatable_pods_utilization | allocated | %
node_allocatable_pods_usage | available, allocated | pods
node_condition | added dynamically | status
node_schedulability | schedulable, unschedulable | state
node_pods_readiness | ready | %
node_pods_readiness_state | ready, unready | pods
node_pods_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | pods
node_pods_phase | running, failed, succeeded, pending | pods
node_containers | containers, init_containers | containers
node_containers_state | running, waiting, terminated | containers
node_init_containers_state | running, waiting, terminated | containers
node_age | age | seconds

Pod

Metric | Dimensions | Units
pod_cpu_requests_used | requests | millicpu
pod_cpu_limits_used | limits | millicpu
pod_mem_requests_used | requests | bytes
pod_mem_limits_used | limits | bytes
pod_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | state
pod_phase | running, failed, succeeded, pending | state
pod_age | age | seconds
pod_containers | containers, init_containers | containers
pod_containers_state | running, waiting, terminated | containers
pod_init_containers_state | running, waiting, terminated | containers

Pod container

Metric | Dimensions | Units
pod_container_readiness_state | ready | state
pod_container_restarts | restarts | restarts/s
pod_container_state | running, waiting, terminated | state
pod_container_waiting_state_reason | added dynamically | state
pod_container_terminated_state_reason | added dynamically | state

Labels

  • The 'k8s_cluster_id' value is the UID of the 'kube-system' namespace.
  • 'k8s_cluster_name' currently appears only when running on GKE.

Label | Node | Pod | Container
k8s_kind | yes | yes | yes
k8s_cluster_id | yes | yes | yes
k8s_cluster_name | yes | yes | yes
k8s_node_name | yes | yes | yes
k8s_namespace | - | yes | yes
k8s_controller_kind | - | yes | yes
k8s_controller_name | - | yes | yes
k8s_pod_uid | - | yes | yes
k8s_pod_name | - | yes | yes
k8s_qos_class | - | yes | yes
k8s_container_id | - | - | yes
k8s_container_name | - | - | yes
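
To cross-check the 'k8s_cluster_id' label against your cluster, you can read the UID of the 'kube-system' namespace directly with kubectl. This is a minimal sketch: the wrapper function name k8s_cluster_id is hypothetical, and it assumes kubectl is configured for the same cluster that Netdata runs in.

```shell
# Print the UID of the kube-system namespace, which should match the
# value of the k8s_cluster_id label described above.
k8s_cluster_id() {
  kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'
}
```

Reading a single namespace like this uses the get verb on the namespace resource, the same permission listed under Requirements.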

Configuration

No configuration is needed. This module is enabled when you install Netdata using the netdata/helmchart Helm chart.

Troubleshooting

To troubleshoot issues with the k8s_state collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.

First, navigate to your plugins directory, usually /usr/libexec/netdata/plugins.d/. If it is somewhere else on your system, open netdata.conf and look for the "plugins directory" setting. Once you're in the plugins directory, switch to the netdata user.

cd /usr/libexec/netdata/plugins.d/
sudo -u netdata -s

You can now run the go.d.plugin to debug the collector:

./go.d.plugin -d -m k8s_state
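
If the plugins directory is not at the usual path, the "plugins directory" setting can be pulled out of netdata.conf with standard text tools. The helper below is a sketch under assumptions: the function name netdata_plugins_dir is hypothetical, and /etc/netdata/netdata.conf is only an assumed default location, so pass the real path to your file if it differs.

```shell
# Print the value of the "plugins directory" setting from netdata.conf.
# Usage: netdata_plugins_dir [path-to-netdata.conf]
# /etc/netdata/netdata.conf is an assumed default; adjust for your install.
netdata_plugins_dir() {
  conf="${1:-/etc/netdata/netdata.conf}"
  # Match the setting whether or not the line is commented out, take the
  # first match, then strip everything up to and including the "=" sign.
  grep -E '^[[:space:]]*#?[[:space:]]*plugins directory[[:space:]]*=' "$conf" \
    | head -n 1 \
    | sed -E 's/^[^=]*=[[:space:]]*//'
}
```

The value may list more than one directory, in which case go.d.plugin will be in one of them.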
