Netdata Logging

This document describes how Netdata generates its own logs, not how Netdata manages and queries log databases.

Log sources

Netdata supports the following log sources:

  1. daemon, logs generated by the Netdata daemon.
  2. collector, logs generated by Netdata collectors, including internal and external ones.
  3. access, API requests received by Netdata.
  4. health, all alert transitions and notifications.

Log outputs

For each log source, Netdata supports the following output methods:

  • off, to disable this log source.
  • journal, to send the logs to systemd-journal.
  • etw, to send the logs to Event Tracing for Windows (ETW).
  • wel, to send the logs to the Windows Event Log (WEL).
  • syslog, to send the logs to syslog.
  • system, to send the output to stderr or stdout depending on the log source.
  • stdout, to write the logs to Netdata's stdout.
  • stderr, to write the logs to Netdata's stderr.
  • filename, to send the logs to a file.

On Linux, when systemd-journal is available, the default is journal for daemon and collector and filename for the rest. To decide if systemd-journal is available, Netdata checks:

  1. stderr is connected to systemd-journald
  2. /run/systemd/journal/socket exists
  3. /host/run/systemd/journal/socket exists (/host is configurable in containers)

If any of the above is detected, Netdata will select journal for daemon and collector sources.
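
The three checks above can be sketched as follows. This is only an illustration, not Netdata's implementation: the function name `journal_available` and its parameters are hypothetical, and the stderr check assumes the systemd convention of exporting `JOURNAL_STREAM` as `device:inode` for a stream connected to the journal.

```python
import os


def journal_available(host_prefix="/host", stderr_fd=2, environ=None,
                      exists=os.path.exists):
    """Return True if any of the three documented checks succeeds (sketch)."""
    environ = os.environ if environ is None else environ

    # Check 1: stderr is connected to systemd-journald.
    # Assumption: systemd exports JOURNAL_STREAM="<device>:<inode>".
    stream = environ.get("JOURNAL_STREAM", "")
    if stream:
        try:
            dev, ino = (int(part) for part in stream.split(":", 1))
            st = os.fstat(stderr_fd)
            if (st.st_dev, st.st_ino) == (dev, ino):
                return True
        except (ValueError, OSError):
            pass  # malformed variable or unusable descriptor: try the sockets

    # Checks 2 and 3: the journal socket exists, directly or under the host prefix.
    candidates = [
        "/run/systemd/journal/socket",
        host_prefix + "/run/systemd/journal/socket",
    ]
    return any(exists(path) for path in candidates)
```

The `exists` and `environ` parameters exist only to make the sketch easy to exercise without a running systemd.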

On Windows, the default is etw; if that is not available, Netdata falls back to wel. The availability of etw is decided at compile time.

Log formats

| Format | Description |
|---|---|
| journal | journald-specific log format. Automatically selected when logging to systemd-journal. |
| etw | Event Tracing for Windows specific format. Structured logging in Event Viewer. |
| wel | Windows Event Log specific format. Basic field-based logging in Event Viewer. |
| logfmt | logs data as a series of key/value pairs. The default when logging to any output other than journal. |
| json | logs data in JSON format. |
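
To give a feel for the difference between logfmt and json, the sketch below renders the same event both ways. The key names are taken from the log fields listed later in this document. The event values and the quoting rule in `to_logfmt` are assumptions for illustration, and Netdata's actual output may quote values differently.

```python
import json


def to_logfmt(fields):
    """Render a dict as space-separated key=value pairs (illustrative rules)."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        # Assumption: quote values that are empty or contain space, quote or '='.
        if text == "" or any(c in text for c in ' "='):
            text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


# Hypothetical event, not real Netdata output.
event = {"comm": "netdata", "source": "daemon", "level": "info",
         "msg": "service started"}

print(to_logfmt(event))
print(json.dumps(event))
```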

Log levels

Each time Netdata logs, it assigns a priority to the log. It can be one of the following (in order of importance):

| Level | Description |
|---|---|
| emergency | a fatal condition, Netdata will most likely exit immediately after. |
| alert | a very important issue that may affect how Netdata operates. |
| critical | a very important issue the user should know about, which Netdata thinks it can survive. |
| error | an error condition indicating that Netdata is trying to do something, but it fails. |
| warning | something unexpected has happened that may or may not affect the operation of Netdata. |
| notice | something that does not affect the operation of Netdata, but the user should notice. |
| info | the default log level about information the user should know. |
| debug | these are more verbose logs that can be ignored. |

For etw these are mapped to Verbose, Informational, Warning, Error and Critical. For wel these are mapped to Informational, Warning, Error.
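
Because the levels are ordered, a minimum-level filter reduces to an index comparison. The sketch below illustrates that idea only; `LEVELS` and `should_log` are hypothetical names, not Netdata internals.

```python
# Levels from most to least important, as listed in this document.
LEVELS = ["emergency", "alert", "critical", "error",
          "warning", "notice", "info", "debug"]


def should_log(level, minimum="info"):
    """Return True when `level` is at least as important as `minimum`."""
    return LEVELS.index(level) <= LEVELS.index(minimum)


print(should_log("error"))   # more important than info
print(should_log("debug"))   # less important than info
```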

Logs Configuration

In netdata.conf, there are the following settings:

[logs]
# logs to trigger flood protection = 1000
# logs flood protection period = 1m
# facility = daemon
# level = info
# daemon = journal
# collector = journal
# access = /var/log/netdata/access.log
# health = /var/log/netdata/health.log
  • logs to trigger flood protection and logs flood protection period enable log flood protection for the daemon and collector sources. Both can also be configured per log source.
  • facility is used only when Netdata logs to syslog.
  • level defines the minimum level of the logs that will be recorded. This setting is applied only to the daemon and collector sources. It can also be configured per source.
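
The two flood protection settings can be pictured as a counter over a time window. The sketch below assumes the simplest possible semantics (at most N logs per period, with the counter reset when a new period starts). Netdata's actual algorithm may differ, and `FloodProtection` is a hypothetical name.

```python
class FloodProtection:
    """Allow at most `max_logs` entries per `period_s` seconds (sketch)."""

    def __init__(self, max_logs=1000, period_s=60.0):
        self.max_logs = max_logs
        self.period_s = period_s
        self.window_start = None
        self.count = 0

    def allow(self, now):
        """Return True if a log arriving at time `now` (seconds) may be written."""
        if self.window_start is None or now - self.window_start >= self.period_s:
            # A new period begins: reset the counter.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.max_logs
```

With the defaults shown in the configuration above (1000 logs per 1m), the log that would be number 1001 within one minute is the first one suppressed under this model.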

Configuring log sources

Each of the sources (daemon, collector, access, health) accepts the following:

source = {FORMAT},level={LEVEL},protection={LOGS}/{PERIOD}@{OUTPUT}

Where:

  • {FORMAT} is one of the log formats,
  • {LEVEL} is the minimum log level to be logged,
  • {LOGS} is the number of logs to trigger flood protection, configured per output,
  • {PERIOD} is the equivalent of logs flood protection period, configured per output,
  • {OUTPUT} is one of the log outputs.

All parameters can be omitted, except {OUTPUT}. If {OUTPUT} is the only given parameter, @ can be omitted.
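
The syntax above can be illustrated with a small parser. This is a sketch of the documented grammar only, not Netdata's parser: `parse_source` is a hypothetical helper, the example values are made up, and it splits on the first `@`, so an output-only value that itself contains `@` is not handled.

```python
def parse_source(value):
    """Split '{FORMAT},level=...,protection=N/PERIOD@{OUTPUT}' into a dict."""
    options, sep, output = value.partition("@")
    if not sep:
        # Only {OUTPUT} was given, so '@' was omitted.
        return {"output": value.strip()}

    result = {"output": output.strip()}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        key, eq, val = item.partition("=")
        if not eq:
            result["format"] = key            # bare word: the log format
        elif key == "level":
            result["level"] = val
        elif key == "protection":
            logs, _, period = val.partition("/")
            result["protection"] = (int(logs), period)
        else:
            raise ValueError(f"unknown option: {key}")
    return result


print(parse_source("journal"))
print(parse_source("json,level=debug,protection=1000/1m@/var/log/netdata/daemon.log"))
```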

Logs rotation

Netdata comes with logrotate configuration to rotate its log files periodically.

The default is usually found in /etc/logrotate.d/netdata.

Sending a SIGHUP to Netdata will instruct it to re-open all its log files.
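
Sending the signal from a script could look like the sketch below. It assumes a Unix system and that the caller already knows the pid of the Netdata daemon. `request_log_reopen` is a hypothetical helper, and the `kill` parameter exists only so the function can be exercised without signalling a real process.

```python
import os
import signal


def request_log_reopen(pid, kill=os.kill):
    """Send SIGHUP to the process with the given pid (Unix only)."""
    kill(pid, signal.SIGHUP)
```

How the pid is obtained depends on the installation and is not covered here.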

Log Fields

All fields exposed by Netdata

| journal | logfmt and json | etw | wel | Description |
|---|---|---|---|---|
| _SOURCE_REALTIME_TIMESTAMP | time | Timestamp | 1 | the timestamp of the event |
| SYSLOG_IDENTIFIER | comm | Program | 2 | the program logging the event |
| ND_LOG_SOURCE | source | NetdataLogSource | 3 | one of the log sources |
| PRIORITY (numeric) | level (text) | Level (text) | 4 | one of the log levels |
| ERRNO | errno | UnixErrno | 5 | the numeric value of errno |
| INVOCATION_ID | - | InvocationID | 7 | a unique UUID of the Netdata session, reset on every Netdata restart, inherited from systemd when available |
| CODE_LINE | - | CodeLine | 8 | the line number of the source code logging this event |
| CODE_FILE | - | CodeFile | 9 | the filename of the source code logging this event |
| CODE_FUNCTION | - | CodeFunction | 10 | the function name of the source code logging this event |
| TID | tid | ThreadID | 11 | the thread id of the thread logging this event |
| THREAD_TAG | thread | ThreadName | 12 | the name of the thread logging this event |
| MESSAGE_ID | msg_id | MessageID | 13 | see message IDs |
| ND_MODULE | module | Module | 14 | the Netdata module logging this event |
| ND_NIDL_NODE | node | Node | 15 | the hostname of the node the event is related to |
| ND_NIDL_INSTANCE | instance | Instance | 16 | the instance of the node the event is related to |
| ND_NIDL_CONTEXT | context | Context | 17 | the context the event is related to (this is usually the chart name, as shown on Netdata dashboards) |
| ND_NIDL_DIMENSION | dimension | Dimension | 18 | the dimension the event is related to |
| ND_SRC_TRANSPORT | src_transport | SourceTransport | 19 | when the event happened during a request, this is the request transport |
| ND_SRC_IP | src_ip | SourceIP | 24 | when the event happened during an inbound request, this is the IP the request came from |
| ND_SRC_PORT | src_port | SourcePort | 25 | when the event happened during an inbound request, this is the port the request came from |
| ND_SRC_FORWARDED_HOST | src_forwarded_host | SourceForwardedHost | 26 | the contents of the HTTP header X-Forwarded-Host |
| ND_SRC_FORWARDED_FOR | src_forwarded_for | SourceForwardedFor | 27 | the contents of the HTTP header X-Forwarded-For |
| ND_SRC_CAPABILITIES | src_capabilities | SourceCapabilities | 28 | when the request came from a child, this is the communication capabilities of the child |
| ND_DST_TRANSPORT | dst_transport | DestinationTransport | 29 | when the event happened during an outbound request, this is the outbound request transport |
| ND_DST_IP | dst_ip | DestinationIP | 30 | when the event happened during an outbound request, this is the IP of the request destination |
| ND_DST_PORT | dst_port | DestinationPort | 31 | when the event happened during an outbound request, this is the port of the request destination |
| ND_DST_CAPABILITIES | dst_capabilities | DestinationCapabilities | 32 | when the request goes to a parent, this is the communication capabilities of the parent |
| ND_REQUEST_METHOD | req_method | RequestMethod | 33 | when the event happened during an inbound request, this is the method with which the request was received |
| ND_RESPONSE_CODE | code | ResponseCode | 34 | when responding to a request, this is the response code |
| ND_CONNECTION_ID | conn | ConnectionID | 35 | when there is a connection id for an inbound connection, this is the connection id |
| ND_TRANSACTION_ID | transaction | TransactionID | 36 | the transaction id (UUID) of all API requests |
| ND_RESPONSE_SENT_BYTES | sent_bytes | ResponseSentBytes | 37 | the bytes we sent to API responses |
| ND_RESPONSE_SIZE_BYTES | size_bytes | ResponseSizeBytes | 38 | the uncompressed bytes of the API responses |
| ND_RESPONSE_PREP_TIME_USEC | prep_ut | ResponsePreparationTimeUsec | 39 | the time needed to prepare a response |
| ND_RESPONSE_SENT_TIME_USEC | sent_ut | ResponseSentTimeUsec | 40 | the time needed to send a response |
| ND_RESPONSE_TOTAL_TIME_USEC | total_ut | ResponseTotalTimeUsec | 41 | the total time needed to complete a response |
| ND_ALERT_ID | alert_id | AlertID | 42 | the alert id this event is related to |
| ND_ALERT_EVENT_ID | alert_event_id | AlertEventID | 44 | a sequential number of the alert transition (per host) |
| ND_ALERT_UNIQUE_ID | alert_unique_id | AlertUniqueID | 43 | a sequential number of the alert transition (per alert) |
| ND_ALERT_TRANSITION_ID | alert_transition_id | AlertTransitionID | 45 | the unique UUID of this alert transition |
| ND_ALERT_CONFIG | alert_config | AlertConfig | 46 | the alert configuration hash (UUID) |
| ND_ALERT_NAME | alert | AlertName | 47 | the alert name |
| ND_ALERT_CLASS | alert_class | AlertClass | 48 | the alert classification |
| ND_ALERT_COMPONENT | alert_component | AlertComponent | 49 | the alert component |
| ND_ALERT_TYPE | alert_type | AlertType | 50 | the alert type |
| ND_ALERT_EXEC | alert_exec | AlertExec | 51 | the alert notification program |
| ND_ALERT_RECIPIENT | alert_recipient | AlertRecipient | 52 | the alert recipient(s) |
| ND_ALERT_VALUE | alert_value | AlertValue | 54 | the current alert value |
| ND_ALERT_VALUE_OLD | alert_value_old | AlertOldValue | 55 | the previous alert value |
| ND_ALERT_STATUS | alert_status | AlertStatus | 56 | the current alert status |
| ND_ALERT_STATUS_OLD | alert_value_old | AlertOldStatus | 57 | the previous alert status |
| ND_ALERT_UNITS | alert_units | AlertUnits | 59 | the units of the alert |
| ND_ALERT_SUMMARY | alert_summary | AlertSummary | 60 | the summary text of the alert |
| ND_ALERT_INFO | alert_info | AlertInfo | 61 | the info text of the alert |
| ND_ALERT_DURATION | alert_duration | AlertDuration | 53 | the duration the alert was in its previous state |
| ND_ALERT_NOTIFICATION_TIMESTAMP_USEC | alert_notification_timestamp | AlertNotificationTimeUsec | 62 | the timestamp the notification delivery is scheduled |
| ND_REQUEST | request | Request | 63 | the full request during which the event happened |
| MESSAGE | msg | Message | 64 | the event message |

For wel (Windows Event Logs), all logs have an array of 64 field strings, and the index of each string determines its meaning. For etw (Event Tracing for Windows), Netdata logs in a structured way, and field names are available.
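
A consumer of wel events therefore has to map positions back to names. The sketch below shows the idea for a handful of fields, using the wel index and etw name columns of the table above. It assumes that wel index N corresponds to position N-1 of a zero-based list and that unused fields are empty strings. `WEL_FIELD_INDEX` and `name_wel_fields` are hypothetical names, and the sample record is made up.

```python
# A subset of the wel indexes from the fields table, keyed by index.
WEL_FIELD_INDEX = {
    1: "Timestamp",
    2: "Program",
    3: "NetdataLogSource",
    4: "Level",
    64: "Message",
}


def name_wel_fields(strings):
    """Map a list of wel field strings to {name: value}, skipping empty ones."""
    named = {}
    for index, name in WEL_FIELD_INDEX.items():
        position = index - 1  # assumption: wel indexes are 1-based
        if position < len(strings) and strings[position] != "":
            named[name] = strings[position]
    return named


# Hypothetical record: 64 strings, most of them empty.
record = [""] * 64
record[1] = "netdata"
record[2] = "daemon"
record[63] = "service started"
print(name_wel_fields(record))
```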

Message IDs

Netdata assigns specific message IDs to certain events:

  • ed4cdb8f1beb4ad3b57cb3cae2d162fa when a Netdata child connects to this Netdata
  • 6e2e3839067648968b646045dbf28d66 when this Netdata connects to a Netdata parent
  • 9ce0cb58ab8b44df82c4bf1ad9ee22de when alerts change state
  • 6db0018e83e34320ae2a659d78019fb7 when notifications are sent

You can view these events using the Netdata systemd-journal.plugin at the MESSAGE_ID filter, or using journalctl like this:

# query children connection
journalctl MESSAGE_ID=ed4cdb8f1beb4ad3b57cb3cae2d162fa

# query parent connection
journalctl MESSAGE_ID=6e2e3839067648968b646045dbf28d66

# query alert transitions
journalctl MESSAGE_ID=9ce0cb58ab8b44df82c4bf1ad9ee22de

# query alert notifications
journalctl MESSAGE_ID=6db0018e83e34320ae2a659d78019fb7
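
When post-processing journal entries in a script, the same IDs can be used to classify events. The sketch below assumes the entries were exported with `journalctl -o json`, which prints one JSON object per line. `MESSAGE_IDS`, `classify` and the sample line are illustrative, not Netdata tooling or real output.

```python
import json

# Message IDs listed in this document, with short labels chosen for this sketch.
MESSAGE_IDS = {
    "ed4cdb8f1beb4ad3b57cb3cae2d162fa": "child connected",
    "6e2e3839067648968b646045dbf28d66": "connected to parent",
    "9ce0cb58ab8b44df82c4bf1ad9ee22de": "alert transition",
    "6db0018e83e34320ae2a659d78019fb7": "alert notification",
}


def classify(json_line):
    """Return the label for one JSON-encoded journal entry, or 'other'."""
    entry = json.loads(json_line)
    return MESSAGE_IDS.get(entry.get("MESSAGE_ID"), "other")


# Hypothetical entry, reduced to the fields used here.
sample = '{"MESSAGE_ID": "9ce0cb58ab8b44df82c4bf1ad9ee22de", "MESSAGE": "example"}'
print(classify(sample))
```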

Using journalctl to query Netdata logs

The Netdata service's processes execute within the netdata journal namespace. To view the Netdata logs, you should specify the --namespace=netdata option.

# Netdata logs since the last time the service was started
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata

# All netdata logs, the oldest entries are displayed first
journalctl -u netdata --namespace=netdata

# All netdata logs, the newest entries are displayed first
journalctl -u netdata --namespace=netdata -r
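
The same queries can be driven from a script. The sketch below only assembles the argument list used in the commands above and leaves execution to a separate helper. `journalctl_cmd` and `run` are hypothetical names, and running the command requires journalctl and suitable permissions on the host.

```python
import subprocess


def journalctl_cmd(namespace="netdata", unit="netdata", reverse=False):
    """Build the journalctl argument list for the Netdata unit and namespace."""
    cmd = ["journalctl", f"--namespace={namespace}", "-u", unit]
    if reverse:
        cmd.append("-r")  # newest entries first
    return cmd


def run(cmd):
    """Run the command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


print(" ".join(journalctl_cmd(reverse=True)))
```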

Using Event Tracing for Windows (ETW)

ETW requires the publisher Netdata to be registered. Our Windows installer does this automatically.

Registering the publisher is done via a manifest (%SystemRoot%\System32\wevt_netdata_manifest.xml) and its messages resources DLL (%SystemRoot%\System32\wevt_netdata.dll).

If needed, the publisher can be registered and unregistered manually using these commands:

REM register the Netdata publisher
wevtutil im "%SystemRoot%\System32\wevt_netdata_manifest.xml" "/mf:%SystemRoot%\System32\wevt_netdata.dll" "/rf:%SystemRoot%\System32\wevt_netdata.dll"

REM unregister the Netdata publisher
wevtutil um "%SystemRoot%\System32\wevt_netdata_manifest.xml"

The structure of the logs is as follows:

  • Publisher Netdata
    • Channel Netdata/Daemon: general messages about the Netdata service
    • Channel Netdata/Collector: general messages about Netdata external plugins
    • Channel Netdata/Health: alert transitions and general messages generated by Netdata's health engine
    • Channel Netdata/Access: all accesses to Netdata APIs
    • Channel Netdata/Aclk: for Cloud connectivity tracing (disabled by default)

Retention can be configured per Channel via the Event Viewer. Netdata does not set a default, so the system default is used.

IMPORTANT
Event Tracing for Windows (ETW) does not allow logging the percentage character %. A % followed by a number is used recursively for field expansion, and ETW provides no way to escape the character to prevent further expansion.


To work around this limitation, Netdata replaces every % that is followed by a number with ℅ (the Unicode character "care of"). Visually, they look similar, but when copying IPv6 addresses or URLs from the logs, you have to be careful to manually replace ℅ with % before using them.
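
The substitution and its manual reversal can be sketched as follows. This illustrates the rule described above and is not Netdata's code: `etw_escape` and `etw_restore` are hypothetical helpers. Note that the reverse step cannot tell a substituted ℅ from one that was in the original text.

```python
import re

CARE_OF = "\u2105"  # the Unicode character CARE OF


def etw_escape(text):
    """Replace every '%' that is followed by a digit with the care-of sign."""
    return re.sub(r"%(?=[0-9])", CARE_OF, text)


def etw_restore(text):
    """Turn care-of signs back into '%', e.g. before reusing a copied URL."""
    return text.replace(CARE_OF, "%")


escaped = etw_escape("GET /a%20b from fe80::1%2")
print(escaped)
print(etw_restore(escaped))
```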

Using Windows Event Logs (WEL)

WEL has a different log structure, and unfortunately WEL and ETW need to use different names if they are to be used concurrently.

For WEL, Netdata logs as follows:

  • Channel NetdataWEL (unfortunately Netdata cannot be used, it conflicts with the ETW Publisher name)
    • Publisher NetdataDaemon: general messages about the Netdata service
    • Publisher NetdataCollector: general messages about Netdata external plugins
    • Publisher NetdataHealth: alert transitions and general messages generated by Netdata's health engine
    • Publisher NetdataAccess: all accesses to Netdata APIs
    • Publisher NetdataAclk: for Cloud connectivity tracing (disabled by default)

Publishers must have unique names system-wide, so we had to prefix them with Netdata.

Retention can be configured per Publisher via the Event Viewer or the Registry. By default, Netdata sets 20MiB for each of them, except NetdataAclk (5MiB) and NetdataAccess (35MiB), for a total of 100MiB.

For WEL some registry entries are needed. Netdata automatically takes care of them when it starts.

WEL does not have the problem ETW has with the percent character %, so Netdata logs it as-is.

Differences between ETW and WEL

There are key differences between ETW and WEL.

Publishers and Providers

Publishers are collections of ETW Providers. A Publisher is implied by a manifest file: each manifest file is considered a Publisher and can define multiple Providers. Other than that, there is no entity related to Publishers in the system.

Publishers are not defined for WEL.

Providers are the applications or modules logging. Provider names must be unique across the system, for ETW and WEL together.

To define a Provider:

  • ETW requires a Publisher manifest coupled with resources DLLs and must be registered via wevtutil (handled by the Netdata Windows installer automatically).
  • WEL requires some registry entries and a message resources DLL (handled by Netdata automatically on startup).

The Provider appears as Source in the Event Viewer, for both WEL and ETW.

Channels

  • Channels for WEL are collections of WEL Providers (each WEL Provider is a single Stream of logs).
  • Channels for ETW slice the logs of each Provider into multiple Streams.

WEL Channels cannot have the same name as ETW Providers. This is why Netdata's ETW provider is called Netdata, and WEL channel is called NetdataWEL.

Despite the fact that ETW Publishers and WEL Channels are both collections of Providers, they are not similar. In ETW a Publisher is a collection of the publisher's own Providers, but in WEL a Channel may include independent WEL Providers (e.g. the "Applications" Channel). Additionally, WEL Channels cannot include ETW Providers.

Retention

Retention is always defined per Stream.

  • Retention in ETW is defined per ETW Channel (ETW Provider Stream).
  • Retention in WEL is defined per WEL Provider (each WEL Provider is a single Stream).

Messages Formatting

  • ETW supports recursive fields expansion, and therefore %N in fields is expanded recursively (or replaced with an error message if expansion fails). Netdata replaces %N with ℅N to stop recursive expansion (since %N cannot be logged otherwise).
  • WEL performs a single field expansion, and therefore the % character in fields is never expanded.

Usability

  • ETW names all the fields and allows multiple datatypes per field, enabling log consumers to know what each field means and its datatype.
  • WEL uses a simple string table for fields, and consumers need to map these string fields based on their index.
