Kubernetes discovery
Kind: k8s
Overview
Netdata can automatically discover monitorable workloads inside a Kubernetes cluster — pods (with their containers and ports) or Services. The discoverer watches the Kubernetes API in real time, exposes per-pod-container or per-service-port targets to the rule engine, and lets you generate collector jobs from labels, annotations, container images, and ports.
This page covers Kubernetes-specific setup. For the broader Service Discovery model and the shared template-helper reference, see Service Discovery.
How it works
Each Kubernetes discovery pipeline runs as either a pod discoverer or a service discoverer (selected by the role option). It then:
- Connects to the Kubernetes API using the in-cluster service-account credentials (no `api_server` config — the discoverer uses the standard k8s client config-loader chain).
- Watches Pods (or Services) in the configured `namespaces[]`, optionally narrowed by label/field selectors.
- Builds targets:
  - `role: pod` → one target per `(pod, container, container-port)` triple. Container env, image, labels, annotations, and node name are all exposed.
  - `role: service` → one target per `(service, service-port)` pair, with the cluster-internal DNS name (`name.ns.svc:port`) as `.Address`.
- Runs the `services:` rules against each target, producing collector jobs.
- Reconciles in real time — pod/service add/update/delete events update the target set without polling.
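For example, with `role: pod` a pod with IP `10.42.1.23` whose container exposes port `9090` yields a target whose `.Address` is `10.42.1.23:9090`; with `role: service` a Service `web` in the `prod` namespace exposing port `80` yields `web.prod.svc:80` (the IPs and names here are illustrative).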
Limitations
- Stock conf ships in the Helm chart, not this repo: a stock `/etc/netdata/go.d/sd/k8s.conf` is not packaged with the agent. On Kubernetes deployments you should install Netdata via the Helm chart — the chart renders both the discoverer config and a curated rule set tailored to your cluster's Netdata setup.
- Outside Kubernetes: this discoverer requires kube-API access (in-cluster service-account or kubeconfig). Running it on a workstation requires a kubeconfig and is not a typical use case.
- Two roles per pipeline, never both: `role` is a single-valued option. If you want both pod and service discovery, configure two pipelines (see the sketch after this list).
- `local_mode` for pods is opt-in: by default the pod discoverer watches all pods in the configured namespaces. Set `pod.local_mode: true` to restrict discovery to pods on the same node as the Netdata Agent (intended for the parent-on-every-node Helm topology). When `local_mode` is enabled, the env var `MY_NODE_NAME` must be set on the Netdata pod (the Helm chart sets this via the downward API).
- TLS to the API server is mTLS via the in-cluster CA bundle — there is no per-pipeline TLS configuration to override.
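A sketch of the two-pipeline setup: one pipeline per role, each with its own rule set. Whether the second pipeline lives in a separate file or is added as a second pipeline in the UI depends on your deployment.

```yaml
# Pipeline 1: pod targets
disabled: no
discoverer:
  k8s:
    role: pod
services: [ ]   # pod-shaped rules go here
```

```yaml
# Pipeline 2: service targets, configured as a separate pipeline
disabled: no
discoverer:
  k8s:
    role: service
services: [ ]   # service-shaped rules go here
```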
Setup
You can configure the k8s discoverer in two ways:
| Method | Best for | How to |
|---|---|---|
| UI | Fast setup without editing files | Go to Collectors -> go.d -> ServiceDiscovery -> k8s, then add a discovery pipeline. |
| File | File-based configuration or automation | Edit /etc/netdata/go.d/sd/k8s.conf and define the discoverer: and services: blocks. |
Prerequisites
Run on Kubernetes via the Netdata Helm chart
The supported way to run the k8s discoverer is via the Netdata Helm chart. The chart provisions the right RBAC (get/list/watch on pods, services, configmaps, secrets), wires MY_NODE_NAME for local_mode, and ships a stock services: rule set tuned to its parent/child topology.
RBAC permissions
The discoverer needs the following verbs from its service account:
- `pods`: `get`, `list`, `watch` (cluster-wide or per-namespace, matching `namespaces[]`)
- `services`: `get`, `list`, `watch` (only when `role: service`)
- `configmaps`, `secrets`: `get`, `list`, `watch` (only when `role: pod` — used to enrich pod targets with referenced env values)
The Helm chart's default RBAC role covers all of these.
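Outside the Helm chart, an RBAC manifest along these lines grants those verbs. This is a sketch: the names, namespace, and cluster-wide scope are illustrative choices, and you can drop `services` or `configmaps`/`secrets` if your role does not need them.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: netdata-sd            # illustrative name
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: netdata-sd            # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: netdata-sd
subjects:
  - kind: ServiceAccount
    name: netdata             # the service account your Netdata pods run as
    namespace: netdata        # illustrative namespace
```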
For pod.local_mode: true, set MY_NODE_NAME
When local_mode is enabled, the Netdata Agent reads its node name from MY_NODE_NAME. The Helm chart sets this via the downward API:
```yaml
env:
  - name: MY_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```
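To confirm the variable actually reached the container, you can check it directly (using the same `netdata` namespace and pod placeholder as the verification commands further down this page):

```bash
# Prints the node name when the downward-API wiring is correct; empty output means it is missing.
kubectl exec -n netdata <netdata-pod> -- sh -c 'echo "$MY_NODE_NAME"'
```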
Configuration
Options
The configuration file has two top-level blocks: discoverer: (the options below) and services: (rules that turn discovered pods/services into collector jobs — see Service Rules).
After editing the file, restart the Netdata Agent to load the updated discovery pipeline. The default and recommended deployment path on Kubernetes is the Netdata Helm chart — the chart renders this file and the rules for you.
| Option | Description | Default | Required |
|---|---|---|---|
| role | What to discover. One of pod or service. | | yes |
| namespaces | Namespaces to watch. Empty means all namespaces. | [] (all namespaces) | no |
| selector.label | Label selector applied at watch time (server-side filtering). | | no |
| selector.field | Field selector applied at watch time. | | no |
| pod.local_mode | Restrict pod discovery to pods on the same node as the Netdata Agent. | false | no |
role
- `pod` — produces one target per `(pod, container, port)` triple. Use this for the bulk of in-cluster monitoring (databases, exporters, applications).
- `service` — produces one target per `(service, port)` pair. Use this for cluster-internal endpoints monitored at the service-name DNS level.
To watch both, configure two pipelines.
selector.label
Standard Kubernetes label-selector syntax: app=foo, environment in (prod, staging), etc. Reduces watch traffic when only a subset of pods/services is interesting.
selector.field
Useful field selectors: status.phase=Running, spec.nodeName=node-1. When pod.local_mode: true, the discoverer automatically appends spec.nodeName=$MY_NODE_NAME.
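Both selectors can be set on the same pipeline. A sketch with illustrative values:

```yaml
discoverer:
  k8s:
    role: pod
    selector:
      label: app=foo,environment in (prod, staging)
      field: status.phase=Running
```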
pod.local_mode
Only applies when role: pod. Requires MY_NODE_NAME to be set on the Netdata container. Used by the Helm chart's parent-on-every-node topology to keep watch traffic local.
via UI
- Open the Netdata Dynamic Configuration UI.
- Go to Collectors -> go.d -> ServiceDiscovery -> k8s.
- Add a new discovery pipeline and give it a name.
- Fill in the discoverer-specific settings and the service rules.
- Save the discovery pipeline.
via File
Define the discovery pipeline in /etc/netdata/go.d/sd/k8s.conf.
The file has two top-level blocks: discoverer: (the options above) and services: (rules that turn discovered targets into collector jobs — see Service Rules).
After editing the file, restart the Netdata Agent to load the updated discovery pipeline.
Examples
Pod discovery, local mode (Helm-style)
The configuration the Helm chart renders by default for the parent-on-every-node topology.
```yaml
disabled: no
discoverer:
  k8s:
    role: pod
    pod:
      local_mode: true
services: [ ]
```
Service discovery in a specific namespace
Watch only Services in the monitoring namespace, scoped by a label selector.
```yaml
disabled: no
discoverer:
  k8s:
    role: service
    namespaces:
      - monitoring
    selector:
      label: app.kubernetes.io/component=metrics-endpoint
services: [ ]
```
Service Rules
A services: rule turns each discovered pod-container target (role: pod) or service-port target (role: service) into one or more collector jobs. The two target shapes have different fields — annotations and labels are common to both, but pod targets additionally expose container-level info (image, env, controller).
The shared rule model — function reference (match, glob, hasKey, index, sprig), config_template rendering rules, and the missingkey=error failure semantics — lives on the Service Discovery hub page. The notes below are k8s-specific.
How rules are evaluated
Quick reference — see Rule evaluation semantics on the hub page for the full model.
- Different target shape per role — `role: pod` and `role: service` produce different target structs. Rules in a pipeline must assume one shape — design your pipeline to match the discoverer's `role`. To handle both, run two pipelines.
- Annotation-driven matching is idiomatic — Standard Kubernetes practice is to opt pods/services into monitoring via annotations (e.g. `prometheus.io/scrape: "true"`, `netdata.cloud/scrape: "true"`). Use `hasKey .Annotations "key"` and `index .Annotations "key"` to read them.
- Container ports vs. service ports — Pod targets expose `.Port` / `.PortName` / `.PortProtocol` from the container's `ports[]`. Service targets expose them from the service's `ports[]`. Container ports may not be advertised through a Service — when you want both granularities, run two pipelines.
- Module inference from rule id — For Kubernetes, set `id: <module-name>` so the rendered job inherits the module name automatically — same as the other discoverers.
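For instance, annotation opt-in can be combined with a named-port check in a single rule. The `netdata.cloud/scrape` annotation and the `metrics` port name below are conventions you would define yourself, not requirements of the discoverer:

```yaml
# Sketch: only match pod targets that opt in via annotation AND expose a port named "metrics".
- id: prometheus
  match: '{{ and (hasKey .Annotations "netdata.cloud/scrape") (eq (index .Annotations "netdata.cloud/scrape") "true") (eq .PortName "metrics") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}_{{ .ContName }}
    url: http://{{ .Address }}/metrics
```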
Template Variables
Two distinct target shapes — PodTarget for role: pod and ServiceTarget for role: service.
| Variable | Type | Description |
|---|---|---|
.Address | string | For pods: <pod-IP>:<port> (or just <pod-IP> when no container port is exposed). For services: <svc-name>.<namespace>.svc:<port>. |
.Namespace | string | Pod/Service namespace. |
.Name | string | Pod or Service name. |
.Annotations | map | Pod/Service annotations. Read with index .Annotations "key". |
.Labels | map | Pod/Service labels. Read with index .Labels "key". |
.Port | string | Container port (pod target) or service port (service target). |
.PortName | string | Port name as declared in the spec (http, metrics, …). |
.PortProtocol | string | Port protocol (TCP, UDP). |
.PodIP | string | Pod targets only. IP address of the pod. |
.NodeName | string | Pod targets only. Name of the node hosting the pod. |
.ContName | string | Pod targets only. Container name (within the pod). |
.Image | string | Pod targets only. Container image. |
.Env | map | Pod targets only. Container environment, with values from referenced ConfigMaps and Secrets resolved. |
.ControllerName | string | Pod targets only. Owning controller name (e.g. ReplicaSet name). |
.ControllerKind | string | Pod targets only. Owning controller kind (ReplicaSet, StatefulSet, DaemonSet, Job, …). |
.ClusterIP | string | Service targets only. Cluster IP. |
.ExternalName | string | Service targets only. External name (for type: ExternalName services). |
.Type | string | Service targets only. Service type (ClusterIP, NodePort, LoadBalancer, ExternalName). |
Examples
Each example shows one entry from the services: array. Order matters — see How rules are evaluated.
Pod with prometheus.io/scrape annotation
The de-facto standard "scrape me" annotation. Match pods that opt in, route to the prometheus module.
```yaml
- id: prometheus
  match: '{{ and (hasKey .Annotations "prometheus.io/scrape") (eq (index .Annotations "prometheus.io/scrape") "true") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}_{{ .ContName }}
    url: http://{{ .Address }}{{ index .Annotations "prometheus.io/path" | default "/metrics" }}
```
Service-role: monitor each metrics-endpoint Service
Run with role: service. Match Services that carry a metrics-endpoint component label.
```yaml
- id: prometheus
  match: '{{ and (hasKey .Labels "app.kubernetes.io/component") (eq (index .Labels "app.kubernetes.io/component") "metrics-endpoint") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}
    url: http://{{ .Address }}/metrics
```
Image-driven: nginx pods
Match nginx-image pods on a known port. Use match "sp" for the four-form image family.
```yaml
- id: nginx
  match: '{{ and (eq .Port "80") (match "sp" .Image "nginx nginx:* */nginx */nginx:*") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .Name }}
    url: http://{{ .Address }}/stub_status
```
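Pod-only fields: postgres via container env
A sketch that combines a known port, image matching, and the pod-only .Env and .ControllerName fields. The POSTGRES_USER and POSTGRES_PASSWORD variables are assumptions about how the target container is configured; adjust the DSN to your workload.

```yaml
- id: postgres
  match: '{{ and (eq .Port "5432") (match "sp" .Image "postgres postgres:* */postgres */postgres:*") }}'
  config_template: |
    name: {{ .Namespace }}_{{ .ControllerName }}
    dsn: postgres://{{ index .Env "POSTGRES_USER" }}:{{ index .Env "POSTGRES_PASSWORD" }}@{{ .Address }}/postgres
```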
Verify discovery worked
After enabling the discoverer, confirm it is watching the API and producing targets.
Confirm the discoverer registered
Watch the Netdata Agent log inside the pod for discoverer=kubernetes messages:
```bash
kubectl logs -n netdata <netdata-pod> | grep "discoverer=kubernetes"
```
On startup you should see "instance is started", role information, and which namespaces are being watched. RBAC failures appear as forbidden errors from the watch.
Confirm the API is reachable
From the pod:
```bash
kubectl exec -n netdata <netdata-pod> -- curl -sSk \
  -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
  https://kubernetes.default.svc/api/v1/namespaces
```
A 401 / 403 indicates the service account lacks the right RBAC. The Helm chart provisions the correct role.
Confirm jobs are being created
In the Netdata UI go to Collectors -> go.d -> <module>. Job names follow your config_template — the examples above use <namespace>_<name> patterns.
Troubleshooting
Permission denied (RBAC)
The service account needs get, list, watch on pods (or services), and on configmaps + secrets for pod-role env enrichment. The Helm chart provisions this; out-of-Helm deployments must bind the equivalent role.
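You can check the effective permissions without redeploying. The example assumes a service account named `netdata` in the `netdata` namespace; adjust to your release:

```bash
# "yes" means the bound role grants the verb; "no" points at missing RBAC.
kubectl auth can-i watch pods     --as=system:serviceaccount:netdata:netdata
kubectl auth can-i watch services --as=system:serviceaccount:netdata:netdata
kubectl auth can-i watch secrets  --as=system:serviceaccount:netdata:netdata
```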
local_mode enabled but env "MY_NODE_NAME" not set
When pod.local_mode: true is set but MY_NODE_NAME is missing, the discoverer fails at startup with local_mode is enabled, but env 'MY_NODE_NAME' not set. Set the env via the downward API on the Netdata pod (the Helm chart does this).
No targets discovered
- Confirm pods/services exist in the configured `namespaces[]`.
- If `selector.label` or `selector.field` is set, verify the targets actually carry the matching labels/fields (see the check after this list).
- With `local_mode`, only pods on the same node as the Netdata pod are visible.
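A quick sanity check that the selector matches anything at all, using the namespace and label from the earlier example:

```bash
# Should list at least one object; an empty result means the selector matches nothing.
kubectl get pods,services -n monitoring -l app.kubernetes.io/component=metrics-endpoint
```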
Generated jobs fail to start
The Address resolves to the pod's CNI IP — the Netdata Agent must be able to reach pod IPs. Most CNIs allow this from a pod running in the same cluster, but flat-network requirements differ. For service-role targets, the cluster-internal DNS name (<svc>.<ns>.svc) is used and should always resolve from inside the cluster.
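To test pod-IP reachability from inside the Netdata pod, curl a failing target's .Address directly (the IP and port below are placeholders):

```bash
# Replace 10.42.1.23:9090 with the .Address of a failing target.
kubectl exec -n netdata <netdata-pod> -- curl -sS --max-time 5 http://10.42.1.23:9090/metrics
```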