
Kubernetes discovery

Kind: k8s

Overview

Netdata can automatically discover monitorable workloads inside a Kubernetes cluster — pods (with their containers and ports) or Services. The discoverer watches the Kubernetes API in real time, exposes per-pod-container or per-service-port targets to the rule engine, and lets you generate collector jobs from labels, annotations, container images, and ports.

This page covers Kubernetes-specific setup. For the broader Service Discovery model and the shared template-helper reference, see Service Discovery.

How it works

Each Kubernetes discovery pipeline runs as either a pod discoverer or a service discoverer (selected by the role option). It then:

  1. Connects to the Kubernetes API using the in-cluster service-account credentials (no api_server config — the discoverer uses the standard k8s client config-loader chain).
  2. Watches Pods (or Services) in the configured namespaces[], optionally narrowed by label/field selectors.
  3. Builds targets:
    • role: pod → one target per (pod, container, container-port) triple. Container env, image, labels, annotations, and node name are all exposed.
    • role: service → one target per (service, service-port) pair, with the cluster-internal DNS name (name.ns.svc:port) as .Address.
  4. Runs the services: rules against each target, producing collector jobs.
  5. Reconciles in real time — pod/service add/update/delete events update the target set without polling.
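The target expansion in step 3 can be sketched as follows — hypothetical, simplified types, not Netdata's internal data structures:

```go
package main

import "fmt"

// Hypothetical, simplified shapes — not Netdata's internal types.
type containerPort struct {
	Name string
	Port int
}

type container struct {
	Name  string
	Ports []containerPort
}

type pod struct {
	Name, Namespace, IP string
	Containers          []container
}

// expand mimics role: pod target building — one target
// per (pod, container, container-port) triple.
func expand(p pod) []string {
	var targets []string
	for _, c := range p.Containers {
		for _, cp := range c.Ports {
			targets = append(targets,
				fmt.Sprintf("%s/%s/%s -> %s:%d", p.Namespace, p.Name, c.Name, p.IP, cp.Port))
		}
	}
	return targets
}

func main() {
	p := pod{
		Name: "web-7d4b", Namespace: "default", IP: "10.0.0.5",
		Containers: []container{
			{Name: "nginx", Ports: []containerPort{{Name: "http", Port: 80}}},
			{Name: "exporter", Ports: []containerPort{{Name: "metrics", Port: 9113}}},
		},
	}
	for _, t := range expand(p) {
		fmt.Println(t) // one line per (pod, container, port) target
	}
}
```

A pod with two containers, each exposing one port, yields two targets; a sidecar-heavy pod yields one target per container-port, each carrying its own image and env.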

Limitations

  • Stock conf ships in the Helm chart, not this repo: a stock /etc/netdata/go.d/sd/k8s.conf is not packaged with the agent. On Kubernetes deployments you should install Netdata via the Helm chart — the chart renders both the discoverer config and a curated rule set tailored to your cluster's Netdata setup.
  • Outside Kubernetes: this discoverer requires kube-API access (in-cluster service-account or kubeconfig). Running it on a workstation requires a kubeconfig and is not a typical use case.
  • Two roles per pipeline, never both: role is a single-valued option. If you want both pod and service discovery, configure two pipelines.
  • local_mode for pods is opt-in: by default the pod discoverer watches all pods in the configured namespaces. Set pod.local_mode: true to restrict to pods on the same node as the Netdata Agent (intended for the parent-on-every-node Helm topology). When local_mode is enabled, the env var MY_NODE_NAME must be set on the Netdata pod (the Helm chart sets this via the downward API).
  • TLS to the API server is verified against the in-cluster CA bundle — there is no per-pipeline TLS configuration to override.

Setup

You can configure the k8s discoverer in two ways:

| Method | Best for | How to |
| --- | --- | --- |
| UI | Fast setup without editing files | Go to Collectors -> go.d -> ServiceDiscovery -> k8s, then add a discovery pipeline. |
| File | File-based configuration or automation | Edit /etc/netdata/go.d/sd/k8s.conf and define the discoverer: and services: blocks. |

Prerequisites

Run on Kubernetes via the Netdata Helm chart

The supported way to run the k8s discoverer is via the Netdata Helm chart. The chart provisions the right RBAC (get/list/watch on pods, services, configmaps, secrets), wires MY_NODE_NAME for local_mode, and ships a stock services: rule set tuned to its parent/child topology.

RBAC permissions

The discoverer needs the following verbs from its service account:

  • pods: get, list, watch (cluster-wide or per-namespace, matching namespaces[])
  • services: get, list, watch (only when role: service)
  • configmaps, secrets: get, list, watch (only when role: pod — used to enrich pod targets with referenced env values)

The Helm chart's default RBAC role covers all of these.
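
Outside the Helm chart, a minimal ClusterRole covering these verbs could look like the sketch below (the role name is hypothetical; bind it to the Netdata service account with a ClusterRoleBinding):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: netdata-sd   # hypothetical name
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch"]
```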

For pod.local_mode: true, set MY_NODE_NAME

When local_mode is enabled, the Netdata Agent reads its node name from MY_NODE_NAME. The Helm chart sets this via the downward API:

```yaml
env:
  - name: MY_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```

Configuration

Options

The configuration file has two top-level blocks: discoverer: (the options below) and services: (rules that turn discovered pods/services into collector jobs — see Service Rules).

After editing the file, restart the Netdata Agent to load the updated discovery pipeline. The default and recommended deployment path on Kubernetes is the Netdata Helm chart — the chart renders this file and the rules for you.

| Option | Description | Default | Required |
| --- | --- | --- | --- |
| role | What to discover. One of pod or service. | | yes |
| namespaces | Namespaces to watch. Empty means all namespaces. | [] (all namespaces) | no |
| selector.label | Label selector applied at watch time (server-side filtering). | | no |
| selector.field | Field selector applied at watch time. | | no |
| pod.local_mode | Restrict pod discovery to pods on the same node as the Netdata Agent. | false | no |

role
  • pod — produces one target per (pod, container, port) triple. Use this for the bulk of in-cluster monitoring (databases, exporters, applications).
  • service — produces one target per (service, port) pair. Use this for cluster-internal endpoints monitored at the service-name DNS level.

To watch both, configure two pipelines.
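
For example, two pipeline files, one per role — the filenames are illustrative, assuming the agent loads each file under /etc/netdata/go.d/sd/ as its own pipeline:

```yaml
# /etc/netdata/go.d/sd/k8s-pods.conf (hypothetical filename)
disabled: no
discoverer:
  k8s:
    role: pod
services: [ ]

# /etc/netdata/go.d/sd/k8s-services.conf (separate file)
disabled: no
discoverer:
  k8s:
    role: service
services: [ ]
```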

selector.label

Standard Kubernetes label-selector syntax: app=foo, environment in (prod, staging), etc. Reduces watch traffic when only a subset of pods/services is interesting.

selector.field

Useful field selectors: status.phase=Running, spec.nodeName=node-1. When pod.local_mode: true, the discoverer automatically appends spec.nodeName=$MY_NODE_NAME.
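
For example, a pipeline combining both selectors to watch only running pods of a single app (selector values are illustrative):

```yaml
disabled: no
discoverer:
  k8s:
    role: pod
    selector:
      label: app=foo
      field: status.phase=Running
services: [ ]
```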

pod.local_mode

Only applies when role: pod. Requires MY_NODE_NAME to be set on the Netdata container. Used by the Helm chart's parent-on-every-node topology to keep watch traffic local.

via UI

  1. Open the Netdata Dynamic Configuration UI.
  2. Go to Collectors -> go.d -> ServiceDiscovery -> k8s.
  3. Add a new discovery pipeline and give it a name.
  4. Fill in the discoverer-specific settings and the service rules.
  5. Save the discovery pipeline.

via File

Define the discovery pipeline in /etc/netdata/go.d/sd/k8s.conf.

The file has two top-level blocks: discoverer: (the options above) and services: (rules that turn discovered targets into collector jobs — see Service Rules).

After editing the file, restart the Netdata Agent to load the updated discovery pipeline.

Examples
Pod discovery, local mode (Helm-style)

The configuration the Helm chart renders by default for the parent-on-every-node topology.

```yaml
disabled: no
discoverer:
  k8s:
    role: pod
    pod:
      local_mode: true
services: [ ]
```

Service discovery in a specific namespace

Watch only Services in the monitoring namespace, scoped by a label selector.

```yaml
disabled: no
discoverer:
  k8s:
    role: service
    namespaces:
      - monitoring
    selector:
      label: app.kubernetes.io/component=metrics-endpoint
services: [ ]
```

Service Rules

A services: rule turns each discovered pod-container target (role: pod) or service-port target (role: service) into one or more collector jobs. The two target shapes have different fields — annotations and labels are common to both, but pod targets additionally expose container-level info (image, env, controller).

The shared rule model — function reference (match, glob, hasKey, index, sprig), config_template rendering rules, and the missingkey=error failure semantics — lives on the Service Discovery hub page. The notes below are k8s-specific.
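
As a standalone illustration of the missingkey=error semantics mentioned above — a config_template that references a field the target shape lacks (e.g. a pod-only field against a service target) fails to render rather than emitting an empty value. A minimal sketch using Go's standard text/template, with hypothetical target values:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render executes a config_template-style Go template against a target map,
// with missingkey=error: referencing a field the target lacks fails the render.
func render(tmplText string, target map[string]string) (string, error) {
	tmpl, err := template.New("cfg").Option("missingkey=error").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, target); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// A service-shaped target (hypothetical values) has no NodeName field.
	svc := map[string]string{
		"Namespace": "monitoring",
		"Name":      "prom",
		"Address":   "prom.monitoring.svc:9090",
	}

	ok, _ := render("url: http://{{ .Address }}/metrics", svc)
	fmt.Println(ok) // url: http://prom.monitoring.svc:9090/metrics

	// Pod-only field on a service target: Execute errors
	// instead of rendering "<no value>".
	_, err := render("node: {{ .NodeName }}", svc)
	fmt.Println("render failed:", err != nil) // render failed: true
}
```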

How rules are evaluated

Quick reference — see Rule evaluation semantics on the hub page for the full model.

  • Different target shape per role — role: pod and role: service produce different target structs. Rules in a pipeline must assume one shape — design your pipeline to match the discoverer's role. To handle both, run two pipelines.
  • Annotation-driven matching is idiomatic — Standard Kubernetes practice is to opt pods/services into monitoring via annotations (e.g. prometheus.io/scrape: "true", netdata.cloud/scrape: "true"). Use hasKey .Annotations "key" and index .Annotations "key" to read them.
  • Container ports vs. service ports — Pod targets expose .Port / .PortName / .PortProtocol from the container's ports[]. Service targets expose them from the service's ports[]. Container ports may not be advertised through a Service — when you want both granularities, run two pipelines.
  • Module inference from rule id — For Kubernetes, set id: <module-name> so the rendered job inherits the module name automatically — same as the other discoverers.

Template Variables

Two distinct target shapes — PodTarget for role: pod and ServiceTarget for role: service.

| Variable | Type | Description |
| --- | --- | --- |
| .Address | string | For pods: `<pod-IP>:<port>` (or just `<pod-IP>` when no container port is exposed). For services: `<svc-name>.<namespace>.svc:<port>`. |
| .Namespace | string | Pod/Service namespace. |
| .Name | string | Pod or Service name. |
| .Annotations | map | Pod/Service annotations. Read with index .Annotations "key". |
| .Labels | map | Pod/Service labels. Read with index .Labels "key". |
| .Port | string | Container port (pod target) or service port (service target). |
| .PortName | string | Port name as declared in the spec (http, metrics, …). |
| .PortProtocol | string | Port protocol (TCP, UDP). |
| .PodIP | string | Pod targets only. IP address of the pod. |
| .NodeName | string | Pod targets only. Name of the node hosting the pod. |
| .ContName | string | Pod targets only. Container name (within the pod). |
| .Image | string | Pod targets only. Container image. |
| .Env | map | Pod targets only. Container environment, with values from referenced ConfigMaps and Secrets resolved. |
| .ControllerName | string | Pod targets only. Owning controller name (e.g. ReplicaSet name). |
| .ControllerKind | string | Pod targets only. Owning controller kind (ReplicaSet, StatefulSet, DaemonSet, Job, …). |
| .ClusterIP | string | Service targets only. Cluster IP. |
| .ExternalName | string | Service targets only. External name (for type: ExternalName services). |
| .Type | string | Service targets only. Service type (ClusterIP, NodePort, LoadBalancer, ExternalName). |

Examples

Each example shows one entry from the services: array. Order matters — see How rules are evaluated.

Pod with prometheus.io/scrape annotation

The de-facto standard "scrape me" annotation. Match pods that opt in, route to the prometheus module.

```yaml
- id: prometheus
  match: '{{ and (hasKey .Annotations "prometheus.io/scrape") (eq (index .Annotations "prometheus.io/scrape") "true") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}_{{ .ContName }}
    url: http://{{ .Address }}{{ index .Annotations "prometheus.io/path" | default "/metrics" }}
```

Service-role: monitor each metrics-endpoint Service

Run with role: service. Match Services that carry a metrics-endpoint component label.

```yaml
- id: prometheus
  match: '{{ and (hasKey .Labels "app.kubernetes.io/component") (eq (index .Labels "app.kubernetes.io/component") "metrics-endpoint") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}
    url: http://{{ .Address }}/metrics
```

Image-driven: nginx pods

Match nginx-image pods on a known port. Use match "sp" for the four-form image family.

```yaml
- id: nginx
  match: '{{ and (eq .Port "80") (match "sp" .Image "nginx nginx:* */nginx */nginx:*") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}
    url: http://{{ .Address }}/stub_status
```
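
Env-driven: PostgreSQL pods

Pod targets also expose .Env, so a rule can match on container environment variables — for example, routing PostgreSQL pods to the postgres module. The env-var check and the DSN credentials are illustrative assumptions:

```yaml
- id: postgres
  match: '{{ and (eq .Port "5432") (hasKey .Env "POSTGRES_PASSWORD") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}
    dsn: postgres://netdata@{{ .Address }}/postgres
```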

Verify discovery worked

After enabling the discoverer, confirm it is watching the API and producing targets.

Confirm the discoverer registered

Watch the Netdata Agent log inside the pod for discoverer=kubernetes messages:

```bash
kubectl logs -n netdata <netdata-pod> | grep "discoverer=kubernetes"
```

On startup you should see "instance is started", role information, and which namespaces are being watched. RBAC failures appear as forbidden errors from the watch.

Confirm the API is reachable

From the pod:

```bash
kubectl exec -n netdata <netdata-pod> -- curl -sSk \
  -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
  https://kubernetes.default.svc/api/v1/namespaces
```

A 401 / 403 indicates the service account lacks the right RBAC. The Helm chart provisions the correct role.

Confirm jobs are being created

In the Netdata UI go to Collectors -> go.d -> <module>. Job names follow your config_template — the examples above use <namespace>_<name> patterns.

Troubleshooting

Permission denied (RBAC)

The service account needs get, list, watch on pods (or services), and on configmaps + secrets for pod-role env enrichment. The Helm chart provisions this; out-of-Helm deployments must bind the equivalent role.

local_mode enabled but env "MY_NODE_NAME" not set

When pod.local_mode: true is set but MY_NODE_NAME is missing, the discoverer fails at startup with local_mode is enabled, but env 'MY_NODE_NAME' not set. Set the env via the downward API on the Netdata pod (the Helm chart does this).

No targets discovered

  • Confirm pods/services exist in the configured namespaces[].
  • If selector.label or selector.field is set, verify the targets actually carry the matching labels/fields.
  • With local_mode, only pods on the same node as the Netdata pod are visible.

Generated jobs fail to start

The Address resolves to the pod's CNI IP — the Netdata Agent must be able to reach pod IPs. Most CNIs allow this from a pod running in the same cluster, but flat-network requirements differ. For service-role targets, the cluster-internal DNS name (<svc>.<ns>.svc) is used and should always resolve from inside the cluster.

