Nodes Ephemerality in Netdata

Node Types

Netdata categorizes nodes into two types:

Type | Description | Common Use Cases
Ephemeral | Expected to disconnect or reconnect frequently | Auto-scaling cloud instances; dynamic containers and VMs; IoT devices with intermittent connectivity; development/test environments with frequent restarts
Permanent | Expected to maintain continuous connectivity | Production servers; core infrastructure nodes; critical monitoring systems; stable database servers

Note: Disconnections in permanent nodes indicate potential system failures and require immediate attention.

Key Benefits

  1. Reduced Alert Noise: Disconnection alerts now apply only to permanent nodes, helping teams focus on actual issues.
  2. Improved Dynamic Infrastructure Support: Auto-scaling cloud instances, containers, and other temporary resources can be designated as ephemeral to prevent unnecessary alerts.
  3. Automated Node Cleanup: Ephemeral nodes are removed based on configurable retention periods, keeping dashboards relevant and uncluttered.

Configuring Ephemeral Nodes

By default, Netdata treats all nodes as permanent. To mark a node as ephemeral:

  1. Open the netdata.conf file on the target node.
  2. Add the following configuration:
    [global]
    is ephemeral node = yes
  3. Restart the node.

This setting applies the _is_ephemeral host label, which propagates to Netdata Parents and Netdata Cloud.
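One way to confirm the setting is in place before restarting is to check the file from a script. The following is a minimal shell sketch, not part of Netdata: the function name is_ephemeral_configured is hypothetical, and the path /etc/netdata/netdata.conf in the usage comment is an assumption that may differ on your installation. It only recognizes the exact value yes shown above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Netdata): succeed if the given
# netdata.conf-style file sets "is ephemeral node = yes" in [global].
is_ephemeral_configured() {
    conf_file="$1"
    awk '
        /^[ \t]*\[/ {
            # Remember the current section name: "[global]" becomes "global".
            section = $0
            gsub(/^[ \t]*\[|\][ \t]*$/, "", section)
            next
        }
        section == "global" && /^[ \t]*is ephemeral node[ \t]*=/ {
            # Keep only the text after "=", without trailing whitespace.
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)
            sub(/[ \t]+$/, "", value)
            if (value == "yes") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$conf_file"
}

# Usage (path is an assumption, adjust to your installation):
#   is_ephemeral_configured /etc/netdata/netdata.conf && echo "ephemeral"
```

Commented-out lines and the same key under any other section are ignored, so the check passes only when the option is set as in step 2.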

Alerts for Parent Nodes

Netdata v2.3.0 introduces two new alerts specifically for permanent nodes:

Alert | Trigger Condition
streaming_never_connected | A permanent node has never connected to a Netdata Parent.
streaming_disconnected | A previously connected permanent node has disconnected.

Monitoring Child Node Status

To investigate an alert:

  1. Open the Top tab in your Netdata dashboard.
  2. Select the Netdata-streaming function.
  3. Review the node status table:
    • Red lines: Connection issues when nodes attempt to connect to a Parent.
    • Yellow lines: Restreaming issues when a Parent streams data to another Parent.
    • Color highlighting applies only to permanent nodes.
    • Use the Ephemerality filter to view only permanent nodes.
    • Check InStatus, InReason, and InAge for incoming connection status.
    • Check OutStatus, OutReason, and OutAge for outgoing streaming status.

Managing Offline Nodes

The Netdata CLI tool has two commands for working with archived nodes.

mark-stale-nodes-ephemeral

To mark permanently offline nodes, including virtual nodes, as ephemeral:

netdatacli mark-stale-nodes-ephemeral <node_id | machine_guid | hostname | ALL_NODES>

This keeps the previously collected metrics data available for querying and clears any active alerts.

Note: Nodes will revert to permanent status if they reconnect unless explicitly configured as ephemeral in netdata.conf.
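When several decommissioned nodes need the same treatment, the command can be wrapped in a small loop. This is a sketch under stated assumptions: the function name mark_stale_ephemeral, the DRY_RUN variable, and the hostnames in the usage comment are all hypothetical; only the netdatacli subcommand itself comes from this page.

```shell
#!/bin/sh
# Hypothetical wrapper: run "netdatacli mark-stale-nodes-ephemeral" once
# per argument. Each argument may be a node_id, machine_guid, or hostname.
# Set DRY_RUN=1 (an assumption of this sketch, not a netdatacli feature)
# to print the commands instead of running them.
mark_stale_ephemeral() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "netdatacli mark-stale-nodes-ephemeral $node"
        else
            netdatacli mark-stale-nodes-ephemeral "$node"
        fi
    done
}

# Usage (hostnames are placeholders):
#   DRY_RUN=1
#   mark_stale_ephemeral old-web-01 old-web-02
```

Running with DRY_RUN=1 first makes it easy to review the exact node list before any status is changed.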

remove-stale-node

To fully remove permanently offline nodes:

netdatacli remove-stale-node <node_id | machine_guid | hostname | ALL_NODES>

This is like the mark-stale-nodes-ephemeral subcommand, but it also removes the nodes so they are no longer available for querying.
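Because removal makes a node's data unavailable for querying, it may be worth guarding the command with a confirmation prompt. The sketch below is illustrative only: the function name remove_stale_node_confirmed and the NETDATACLI override variable are hypothetical and not part of Netdata.

```shell
#!/bin/sh
# Hypothetical guard around "netdatacli remove-stale-node": ask for
# confirmation on stderr, then run the command only on "y" or "Y".
# NETDATACLI is an assumed override for the binary name, useful for testing.
remove_stale_node_confirmed() {
    node="$1"
    printf 'Remove stale node "%s"? It will no longer be available for querying. [y/N] ' "$node" >&2
    read -r answer
    case "$answer" in
        y|Y)
            ${NETDATACLI:-netdatacli} remove-stale-node "$node"
            ;;
        *)
            echo "Skipped $node"
            return 1
            ;;
    esac
}

# Usage (hostname is a placeholder):
#   remove_stale_node_confirmed old-web-01
```

Any answer other than y or Y leaves the node untouched and returns a non-zero status, so the function can also be used safely inside larger scripts.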

Cloud Integration

In Netdata Cloud, ephemeral nodes remain visible but marked as 'stale' as long as at least one Agent reports having queryable metrics data for that node. Once all Agents report the node as offline, ephemeral nodes are automatically removed from Cloud.

From v2.3.0 onward, Netdata Cloud sends unreachable-node notifications only for permanent nodes, reducing unnecessary alerts.

Automatically Removing Ephemeral Nodes

By default, Netdata does not automatically remove disconnected ephemeral nodes. To enable automatic cleanup:

  1. Open the netdata.conf file on Netdata Parent nodes.
  2. Add the following configuration:
    [db]
    cleanup ephemeral hosts after = 1d
  3. Restart the node.

This setting removes ephemeral nodes from queries after 24 hours of disconnection. Once all Parent nodes have removed a node, Netdata Cloud automatically deletes it as well.

