Netdata Cloud features enhanced visualizations for the resource utilization of Kubernetes (k8s) clusters, embedded in the default Overview dashboard.
These visualizations include a health map for viewing the status of k8s pods/containers, in addition to composite charts for viewing per-second CPU, memory, disk, and networking metrics from k8s nodes.
See our Kubernetes deployment instructions for details on installation and connecting to Netdata Cloud.
Available Kubernetes metrics
Netdata Cloud organizes and visualizes the following metrics from every container in your Kubernetes cluster:
cpu_limit: CPU utilization as a percentage of the limit defined by the pod specification
cpu: CPU utilization of the pod/container. 100% usage equals 1 fully-utilized core, 200% equals 2 fully-utilized cores, and so on.
cpu_per_core: CPU utilization averaged across available cores.
mem_usage_limit: Memory utilization, without cache, as a percentage of the limit defined by the pod specification
mem_usage: Used memory, without cache.
mem: The sum of
rss (resident set size) memory usage.
writeback: The size of
mem_activity: Sum of
pgfaults: Sum of page fault bandwidth. Page faults are raised when the Kubernetes cluster tries to access a memory page that is mapped into the virtual address space but not actually loaded into main memory.
throttle_io: Sum of
write per second across all PVs/PVCs attached to the container.
throttle_serviced_ops: Sum of the
write operations per second across all PVs/PVCs attached to the container.
net.net: Sum of
sent bandwidth per second.
net.packets: Sum of
When viewing the health map, Netdata Cloud shows the above metrics per container, or aggregated based on their associated pods.
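The CPU metrics listed above use different units, which a short sketch can make concrete. The Python helpers below are illustrative only: the function names and the exact formulas for cpu_per_core and cpu_limit are assumptions derived from the descriptions above, not Netdata's implementation.

```python
def cpu_percent_to_cores(cpu_percent: float) -> float:
    """Convert a `cpu` reading to cores: 100% equals one fully utilized core."""
    return cpu_percent / 100.0


def cpu_per_core(cpu_percent: float, available_cores: int) -> float:
    """Average a `cpu` reading across the available cores (assumed formula)."""
    if available_cores <= 0:
        raise ValueError("available_cores must be positive")
    return cpu_percent / available_cores


def cpu_limit_percent(cpu_percent: float, limit_cores: float) -> float:
    """Express usage as a percentage of a CPU limit given in cores (assumed formula)."""
    if limit_cores <= 0:
        raise ValueError("limit_cores must be positive")
    return cpu_percent_to_cores(cpu_percent) / limit_cores * 100.0
```

Under these assumptions, a container reading 200% on `cpu` is using two cores. On a four-core node that is 50% per core, and a container using a quarter of a core against a half-core limit sits at 50% of its limit.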
When viewing the composite charts, Netdata Cloud aggregates metrics from multiple nodes, pods, or containers, depending on the grouping chosen. For example, if you group the cpu_limit composite chart by k8s_namespace, the metrics shown will be the average of cpu_limit metrics from all nodes/pods/containers that are part of that namespace.
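The grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not Netdata Cloud's code: the sample structure, the `average_by_label` helper, and the example values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_label(samples, label, metric):
    """Average one metric across all samples that share the same label value."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["labels"][label]].append(sample["metrics"][metric])
    return {value: mean(readings) for value, readings in groups.items()}


# Hypothetical per-container readings, each tagged with its namespace.
samples = [
    {"labels": {"k8s_namespace": "default"}, "metrics": {"cpu_limit": 20.0}},
    {"labels": {"k8s_namespace": "default"}, "metrics": {"cpu_limit": 40.0}},
    {"labels": {"k8s_namespace": "kube-system"}, "metrics": {"cpu_limit": 10.0}},
]

by_namespace = average_by_label(samples, "k8s_namespace", "cpu_limit")
```

Grouping by a different label only changes the `label` argument. The averaging step stays the same.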
The health map places each container or pod as a single box, then varies the intensity of its color to visualize the resource utilization of specific k8s pods/containers.
Change the health map's coloring, grouping, and displayed nodes to customize your experience and learn more about the status of your k8s cluster.
Color the health map by choosing an aggregate function to apply to an available Kubernetes metric, then choose whether to display boxes for individual pods or containers.
The default is the average of CPU within the configured limit, organized by container.
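As a rough mental model of the coloring options, the sketch below applies an aggregate function to each box's readings and maps the result to a color intensity. Everything here is an assumption for illustration: the set of aggregate functions, the pooling of container readings into a pod, and the linear 0 to 100 scale are hypothetical and may differ from what Netdata Cloud does.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical set of aggregate functions; the real options may differ.
AGGREGATES = {"average": mean, "min": min, "max": max}


def box_values(readings, aggregate="average", per="container"):
    """Return one aggregated value per box.

    readings: iterable of (pod, container, values) tuples.
    per: "container" for one box per container, "pod" to pool a pod's containers.
    """
    func = AGGREGATES[aggregate]
    grouped = defaultdict(list)
    for pod, container, values in readings:
        key = pod if per == "pod" else f"{pod}/{container}"
        grouped[key].extend(values)
    return {key: func(values) for key, values in grouped.items()}


def color_intensity(value, scale_max=100.0):
    """Map a value to an intensity in [0, 1] on an assumed linear scale."""
    return max(0.0, min(1.0, value / scale_max))


# Hypothetical cpu_limit readings for three containers in two pods.
readings = [
    ("web-1", "nginx", [10.0, 30.0]),
    ("web-1", "sidecar", [50.0]),
    ("db-0", "postgres", [80.0, 100.0]),
]
```

With the defaults, each container gets its own box colored by its average. Passing `per="pod"` pools the readings of all containers in a pod before aggregating.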
Group the health map by the
k8s_node_name. The default is
Filtering behaves identically to the node filter in War Rooms, with the ability to
filter pods/containers by
Hover over any of the pods/containers in the map to display a modal window, which contains contextual information and real-time metrics from that resource.
The Context tab provides the following details about a container or pod:
- Cluster ID
- Controller Kind
- Controller Name
- Pod Name
- Pod UID
This information helps orient you as to where the container/pod operates inside your cluster.
The Metrics tab contains charts visualizing the last 15 minutes of the same metrics available in the color by option. Use these metrics, along with the context, to identify which containers or pods are behaving abnormally, then investigate, troubleshoot, and remediate with kubectl or another tool.
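Once the health map points to a suspect pod, a first look with kubectl might resemble the sketch below. The `kubectl_commands` helper is hypothetical and only builds argument lists, so nothing runs until you choose to execute them. The namespace and container arguments are assumptions, since you need to know them for the pod in question.

```python
def kubectl_commands(pod_name, namespace="default", container=None, tail=100):
    """Build kubectl argument lists for a first look at a pod (nothing is executed)."""
    describe = ["kubectl", "describe", "pod", pod_name, "--namespace", namespace]
    logs = ["kubectl", "logs", pod_name, "--namespace", namespace, f"--tail={tail}"]
    if container:
        # Limit the logs to one container in a multi-container pod.
        logs += ["--container", container]
    return [describe, logs]
```

On a machine where kubectl is configured for the cluster, each list can be passed to `subprocess.run` to execute the command.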
The Kubernetes composite charts show real-time and historical resource utilization metrics from nodes, pods, or containers within your Kubernetes deployment.
See the Overview doc for details on how composite charts work. These work similarly, but in addition to visualizing by dimension and by node, Kubernetes composite charts can also be grouped by the following labels:
In addition, when you hover over a composite chart, the colors in the health map change as well, so you can see how certain pod/container-level metrics change over time.
There are some caveats and known issues with Kubernetes monitoring with Netdata Cloud.
- There is no way to remove nodes you have drained from your Kubernetes cluster. These drained nodes are marked "unreachable" and show up in War Room management screens and dropdowns. The same applies to any ephemeral nodes created and destroyed during horizontal scaling.