Agent Alert Notifications

Netdata's Agent can send alert notifications directly from each node. It supports a wide range of services, multiple recipients, and role-based routing.

How It Works

The Agent uses a notification script defined in netdata.conf under the [health] section:

script to execute on alarm = /usr/libexec/netdata/plugins.d/alarm-notify.sh

The default script is alarm-notify.sh.

This script handles:

Multiple recipients
Multiple notification methods
Role-based routing (e.g., sysadmin, webmaster, dba)

Quick Setup

Tip: Use the edit-config script to edit configuration files safely. It creates the necessary files in the right place and opens them in your editor.

Open the Agent's health notification config (run this from your Netdata config directory, typically /etc/netdata):
```
sudo ./edit-config health_alarm_notify.conf
```
Set up the required API keys or credentials for the service you want to use.
Define recipients per role (see below).
Restart the Agent for changes to take effect:
```
sudo systemctl restart netdata
```
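Because health_alarm_notify.conf is read by the bash-based alarm-notify.sh script, a stray quote can silently break notifications. The following is a minimal sketch of a pre-restart check, not part of Netdata itself: `check_conf_syntax` is a hypothetical helper name, and the config path in the usage comment is an assumption about your install. `bash -n` only parses the file; it does not execute it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: parse a config file with bash without running it.
# Returns 0 when the syntax is valid, non-zero otherwise.
check_conf_syntax() {
  local conf="$1"
  if bash -n "$conf" 2>/dev/null; then
    echo "OK: $conf"
  else
    echo "SYNTAX ERROR: $conf" >&2
    return 1
  fi
}

# Example (the path is an assumption; adjust it for your installation):
# check_conf_syntax /etc/netdata/health_alarm_notify.conf && sudo systemctl restart netdata
```

Chaining the check with `&&` means the Agent is only restarted when the file parses cleanly.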

Example: Alert with Role-Based Routing

Here's an example alert assigned to the sysadmin role from the ram.conf file:

alarm: ram_in_use
   on: system.ram
class: Utilization
 type: System
component: Memory
     os: linux
  hosts: *
   calc: $used * 100 / ($used + $cached + $free + $buffers)
  units: %
  every: 10s
   warn: $this > (($status >= $WARNING)  ? (80) : (90))
   crit: $this > (($status == $CRITICAL) ? (90) : (98))
  delay: down 15m multiplier 1.5 max 1h
   info: system memory utilization
     to: sysadmin

Then, in health_alarm_notify.conf, you assign recipients per notification method:

role_recipients_email[sysadmin]="admin1@example.com admin2@example.com"
role_recipients_slack[sysadmin]="#alerts #infra"

Advanced Role-Based Routing Examples

DevOps Team Example

# Backend team receives database and application server alerts
role_recipients_slack[backend]="#backend-team"
role_recipients_pagerduty[backend]="PDK3Y5EXAMPLE"

# Frontend team receives web server and CDN alerts
role_recipients_slack[frontend]="#frontend-team"
role_recipients_opsgenie[frontend]="key1example"

# Security team receives all security-related alerts
role_recipients_email[security]="security@example.com"
role_recipients_slack[security]="#security-alerts"

# SRE team receives critical infrastructure alerts 24/7
role_recipients_slack[sre]="#sre-alerts"
role_recipients_pagerduty[sre]="PDK3Y5SREXAMPLE"
role_recipients_telegram[sre]="123456789"

Time-Based Routing Example

You can use external scripts to dynamically change recipients based on work hours, on-call schedules, etc.:

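# Use your own scripts to look up the current on-call engineer.
# get_oncall_email.sh and get_oncall_phone.sh are placeholders for helpers you provide.
ONCALL_EMAIL=$(get_oncall_email.sh)
ONCALL_PHONE=$(get_oncall_phone.sh)
role_recipients_email[oncall]="${ONCALL_EMAIL}"
role_recipients_sms[oncall]="${ONCALL_PHONE}"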

# Standard business hours team gets non-critical alerts during work hours
role_recipients_slack[business_hours]="#daytime-monitoring"
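Since the configuration file is plain bash, the business-hours idea above can be expressed directly in it. This is a sketch under assumptions: `is_business_hours` is a hypothetical helper, the Monday to Friday 09:00 to 16:59 window is illustrative, and it assumes the file is evaluated each time the notification script runs. The `declare -A` line is only needed to run the sketch standalone; leave it out inside health_alarm_notify.conf.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the given hour (00-23) and ISO weekday
# (1=Monday .. 7=Sunday) fall inside Monday-Friday, 09:00-16:59.
is_business_hours() {
  local hour=$((10#$1)) weekday="$2"   # 10# avoids treating "08"/"09" as octal
  [ "$weekday" -le 5 ] && [ "$hour" -ge 9 ] && [ "$hour" -lt 17 ]
}

# Only needed when running this sketch outside the Netdata config file.
declare -A role_recipients_slack

# date +%H prints the zero-padded hour, date +%u the ISO weekday number.
if is_business_hours "$(date +%H)" "$(date +%u)"; then
  role_recipients_slack[business_hours]="#daytime-monitoring"
else
  role_recipients_slack[business_hours]="disabled"
fi
```

Outside the window the role is set to "disabled", so alerts routed to it produce no Slack message.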

Health Management API

Netdata provides a powerful Health Management API that lets you control alert behavior during maintenance windows, testing, or other planned activities.

API Authorization

The API is protected by an authorization token stored in /var/lib/netdata/netdata.api.key:

# Get your token
TOKEN=$(cat /var/lib/netdata/netdata.api.key)

# Use the token in API calls
curl "http://localhost:19999/api/v1/manage/health?cmd=RESET" -H "X-Auth-Token: ${TOKEN}"
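To avoid repeating the token lookup and URL in every call, the two steps above can be wrapped in small functions. This is a sketch, not an official client: `health_url` and `health_api` are hypothetical names, and `NETDATA_URL` and `NETDATA_API_KEY_FILE` are made-up environment variables used here only to override the defaults shown in this section. Spaces in the query are percent-encoded so commands such as `DISABLE ALL` form a valid URL.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the Health Management API URL for a query string.
# Spaces are encoded as %20 so the result is a valid URL.
health_url() {
  local base="${NETDATA_URL:-http://localhost:19999}" query="$1"
  printf '%s/api/v1/manage/health?%s\n' "$base" "${query// /%20}"
}

# Hypothetical helper: read the token and send one authorized request.
health_api() {
  local token
  token="$(cat "${NETDATA_API_KEY_FILE:-/var/lib/netdata/netdata.api.key}")" || return 1
  curl -sS "$(health_url "$1")" -H "X-Auth-Token: ${token}"
}

# Example: health_api "cmd=RESET"
```

Reading the key file usually requires elevated permissions, so `health_api` may need to run as root or as the netdata user.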

Common API Commands

Disable All Health Checks

Completely stops evaluation of health checks during maintenance:

curl "http://localhost:19999/api/v1/manage/health?cmd=DISABLE%20ALL" -H "X-Auth-Token: ${TOKEN}"

Silence All Notifications

Continues to evaluate health checks but prevents notifications:

curl "http://localhost:19999/api/v1/manage/health?cmd=SILENCE%20ALL" -H "X-Auth-Token: ${TOKEN}"

Disable Specific Alerts

Target only certain alerts by name, chart, context, host, or family:

# Silence all disk space alerts
curl "http://localhost:19999/api/v1/manage/health?cmd=SILENCE&context=disk_space" -H "X-Auth-Token: ${TOKEN}"

# Disable CPU alerts for specific hosts
curl "http://localhost:19999/api/v1/manage/health?cmd=DISABLE&context=cpu&hosts=prod-db-*" -H "X-Auth-Token: ${TOKEN}"

View Current Silenced/Disabled Alerts

Check what's currently silenced or disabled:

curl "http://localhost:19999/api/v1/manage/health?cmd=LIST" -H "X-Auth-Token: ${TOKEN}"

Reset to Normal Operation

Re-enable all health checks and notifications:

curl "http://localhost:19999/api/v1/manage/health?cmd=RESET" -H "X-Auth-Token: ${TOKEN}"
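A common pattern is to silence notifications, run a maintenance task, and then reset, making sure the reset happens even if the task fails. The sketch below illustrates that flow under assumptions: `with_silenced_alerts`, `send_with_curl`, and the `HEALTH_SEND` hook are hypothetical names introduced here, and the hook exists only so the sequence can be dry-run without a running Agent.

```shell
#!/usr/bin/env bash
# HEALTH_SEND names the function that delivers one API command.
# It is a hypothetical hook; the default sends the request with curl.
HEALTH_SEND="${HEALTH_SEND:-send_with_curl}"

send_with_curl() {
  local token
  token="$(cat /var/lib/netdata/netdata.api.key)" || return 1
  curl -sS "http://localhost:19999/api/v1/manage/health?cmd=$1" \
    -H "X-Auth-Token: ${token}"
}

# Silence notifications, run the given command, then always reset.
# Returns the exit status of the wrapped command.
with_silenced_alerts() {
  "$HEALTH_SEND" "SILENCE%20ALL" || return 1
  local rc=0
  "$@" || rc=$?
  "$HEALTH_SEND" "RESET"
  return "$rc"
}

# Example: with_silenced_alerts sudo systemctl restart postgresql
```

Returning the wrapped command's exit status lets calling scripts still detect a failed maintenance step after alerting has been restored.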

Configuration Options

Recipients Per Role

Define who receives alerts and how:

role_recipients_email[sysadmin]="admin@example.com"
role_recipients_telegram[webmaster]="123456789"
role_recipients_slack[dba]="#database-alerts"

Use spaces to separate multiple recipients.

To disable a notification method for a role, use:

role_recipients_email[sysadmin]="disabled"

If left empty, the default recipient for that method is used.
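The three cases above (explicit recipients, "disabled", and empty) can be summarized as a small lookup. This is an illustrative sketch of the rules as described here, not the code alarm-notify.sh actually runs: `email_recipients_for` is a hypothetical helper, and the addresses and the default value are placeholders.

```shell
#!/usr/bin/env bash
# Only needed when running this sketch outside the Netdata config file.
declare -A role_recipients_email

DEFAULT_RECIPIENT_EMAIL="root"                      # placeholder default
role_recipients_email[sysadmin]="ops@example.com"   # explicit recipient
role_recipients_email[dba]="disabled"               # method turned off for this role
role_recipients_email[webmaster]=""                 # empty: fall back to the default

# Hypothetical helper: print the effective email recipients for a role.
email_recipients_for() {
  local configured="${role_recipients_email[$1]:-}"
  case "$configured" in
    disabled) echo "" ;;
    "")       echo "$DEFAULT_RECIPIENT_EMAIL" ;;
    *)        echo "$configured" ;;
  esac
}

for role in sysadmin dba webmaster; do
  echo "${role}: '$(email_recipients_for "$role")'"
done
```

A role with no entry at all is treated like an empty one in this sketch, so it also falls back to the default recipient.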

Alert Severity Filtering

You can limit certain recipients to only receive critical alerts:

role_recipients_email[sysadmin]="user1@example.com user2@example.com|critical"

This setup:

Sends all alerts to user1@example.com
Sends only critical-related alerts to user2@example.com

Works for all supported methods: email, Slack, Telegram, Twilio, Discord, etc.
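To make the `|critical` modifier concrete, the sketch below filters a recipient list for a given alert status. It is a simplification under stated assumptions: `filter_recipients` is a hypothetical helper, only the `critical` modifier is handled, and the real script may also deliver related follow-ups (such as recoveries from a critical state) that this sketch ignores.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the recipients that should be notified
# for a given status ("WARNING", "CRITICAL", ...).
filter_recipients() {
  local status="$1" list="$2" entry addr modifier out=""
  local -a entries
  read -r -a entries <<< "$list"
  for entry in "${entries[@]}"; do
    addr="${entry%%|*}"          # text before the first "|"
    modifier=""
    if [ "$entry" != "$addr" ]; then
      modifier="${entry#*|}"     # text after the first "|"
    fi
    # Skip critical-only recipients unless the alert is critical
    if [ "$modifier" = "critical" ] && [ "$status" != "CRITICAL" ]; then
      continue
    fi
    out="${out:+$out }$addr"
  done
  echo "$out"
}

recipients="user1@example.com user2@example.com|critical"
echo "WARNING  -> $(filter_recipients WARNING "$recipients")"
echo "CRITICAL -> $(filter_recipients CRITICAL "$recipients")"
```

With the list above, a warning reaches only user1@example.com, while a critical alert reaches both addresses.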

Proxy Settings

To send notifications via a proxy, set these environment variables:

export http_proxy="http://10.0.0.1:3128/"
export https_proxy="http://10.0.0.1:3128/"

Notification Images

By default, Netdata includes public image URLs in notifications (hosted by the global Registry).

To use custom image paths:

images_base_url="http://my.public.netdata.server:19999"

Custom Date Format

Change the timestamp format in notifications:

date_format="+%F %T%:z"   # Example: RFC 3339

Common formats:

| Format | String |
| --- | --- |
| ISO 8601 | `+%FT%T%z` |
| RFC 5322 | `+%a, %d %b %Y %H:%M:%S %z` |
| RFC 3339 | `+%F %T%:z` |
| Local time | `+%x %X` |
| ANSI C / asctime() | (leave empty) |

See man date for more formatting options.
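To preview a format string before putting it in the config, you can render a fixed instant with `date`. This sketch assumes GNU date (`-d @SECONDS` and `%:z` are GNU extensions); `show_format` is a hypothetical helper, and the instant used is the Unix epoch in UTC with the C locale so the output is reproducible.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render Unix epoch 0 in UTC with the given format.
# Assumes GNU date; other implementations may reject -d @0 or %:z.
show_format() {
  TZ=UTC LC_ALL=C date -d @0 "$1"
}

show_format "+%FT%T%z"                    # ISO 8601 -> 1970-01-01T00:00:00+0000
show_format "+%a, %d %b %Y %H:%M:%S %z"   # RFC 5322 -> Thu, 01 Jan 1970 00:00:00 +0000
show_format "+%F %T%:z"                   # RFC 3339 -> 1970-01-01 00:00:00+00:00
```

Replace `-d @0` with no date argument to see how the current time would appear in a notification.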

Hostname Format

By default, Netdata uses the short hostname in notifications.

To use the fully qualified domain name (FQDN), set:

use_fqdn=YES

If you've set a custom hostname in netdata.conf, that value takes priority.

Testing Your Notification Setup

You can test alert notifications manually.

# Switch to the Netdata user
sudo su -s /bin/bash netdata

# Enable debugging
export NETDATA_ALARM_NOTIFY_DEBUG=1

# Test default role (sysadmin)
/usr/libexec/netdata/plugins.d/alarm-notify.sh test

# Test specific role
/usr/libexec/netdata/plugins.d/alarm-notify.sh test "webmaster"

Important: If you're running your own Netdata Registry, set the following before testing:

export NETDATA_REGISTRY_URL="https://your.registry.url"

Debugging with Trace

To see the full execution output:

bash -x /usr/libexec/netdata/plugins.d/alarm-notify.sh test

Then look for the internal calls and re-run the one you want to trace in more detail.
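Trace output is long, so it can help to keep only the lines that invoke a delivery command. The filter below is a sketch that assumes the calls of interest are `curl` and `sendmail`; `trace_calls` is a hypothetical helper, and you may need to add other command names for your notification methods. `bash -x` writes the trace to stderr with each line prefixed by one or more `+` characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only trace lines whose command is curl or sendmail,
# whether invoked by bare name or by full path.
trace_calls() {
  grep -E '^\++ (.*[ /])?(curl|sendmail)( |$)'
}

# Example (script path as configured in netdata.conf):
# bash -x /usr/libexec/netdata/plugins.d/alarm-notify.sh test 2>&1 | trace_calls
```

Each matching line is a complete command you can copy and re-run by hand to inspect its response.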

Troubleshooting Alert Notifications

Here are solutions for common alert notification issues:

Email Notifications Not Working

Verify your email configuration:

grep -E "SEND_EMAIL|DEFAULT_RECIPIENT_EMAIL" /etc/netdata/health_alarm_notify.conf

Check if the system can send mail:

echo "Test" | mail -s "Test Email" you@example.com

Look for errors in the Netdata log:

tail -f /var/log/netdata/error.log | grep "alarm notify"

Test with debugging enabled:

sudo su -s /bin/bash netdata
export NETDATA_ALARM_NOTIFY_DEBUG=1
/usr/libexec/netdata/plugins.d/alarm-notify.sh test

Slack Notifications Failing

Verify your webhook URL is correct:

grep -E "SLACK_WEBHOOK_URL" /etc/netdata/health_alarm_notify.conf

Check for network connectivity to Slack:

curl -X POST -H "Content-type: application/json" --data '{"text":"Test"}' YOUR_WEBHOOK_URL

Confirm channel names start with # in your configuration.

PagerDuty Integration Issues

Verify your service key:

grep -E "PAGERDUTY_SERVICE_KEY" /etc/netdata/health_alarm_notify.conf

Test the PagerDuty API directly:

curl -H "Content-Type: application/json" -X POST -d '{"service_key":"YOUR_SERVICE_KEY","event_type":"trigger","description":"Test"}' https://events.pagerduty.com/generic/2010-04-15/create_event.json

Notification Delays

If notifications seem delayed:

Check the delay parameter in your alarm configuration
Verify your health.d/*.conf files for delay settings
Check the ALARM_NOTIFY_DELAY setting in health_alarm_notify.conf
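To see how a delay line can add up, take `delay: down 15m multiplier 1.5 max 1h` from the earlier ram_in_use example. The sketch below is a rough illustration, assuming the multiplier is applied each time the alert changes state while a notification is still pending and that the result is capped at `max`. `next_delay` is a hypothetical helper; it works in whole seconds and expresses 1.5 as the fraction 3/2 because bash has no floating-point arithmetic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: next delay in seconds = current * num / den, capped at max.
next_delay() {
  local current="$1" num="$2" den="$3" max="$4"
  local next=$(( current * num / den ))
  if [ "$next" -gt "$max" ]; then
    next="$max"
  fi
  echo "$next"
}

delay=900   # 15m, the initial "down" delay
for change in 1 2 3 4 5; do
  echo "state change ${change}: notification delayed by ${delay}s"
  delay="$(next_delay "$delay" 3 2 3600)"   # multiplier 1.5, max 1h
done
```

Under these assumptions the delays are 900, 1350, 2025, 3037, and 3600 seconds, so a flapping alert can hold back its notification for up to an hour.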
