Azure Event Hubs Namespace

Plugin: go.d.plugin Module: azure_monitor

Overview

Info: This is part of the Azure Monitor collector. No separate setup is needed -- a single Azure Monitor job discovers and monitors all supported resource types automatically.

Monitor Azure Event Hubs with metrics covering:

  • Messages -- message flow (in/out), captured messages and bytes
  • Throughput -- data throughput (in/out bytes per second)
  • Connections -- active connections, connection events (opened/closed)
  • Requests -- incoming and successful request rates
  • Errors -- server errors, user errors, throttled requests, quota exceeded
  • Replication -- replication lag (messages and duration)
  • Resources -- namespace size, CPU and memory utilization

It uses the Azure Monitor Metrics batch API to collect metrics, grouping requests by subscription, region, and time grain. Resources are discovered via Azure Resource Graph queries at startup and refreshed periodically. Authentication is handled through Microsoft Entra ID (service principal, managed identity, or default credentials).

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

The service principal or managed identity requires these Azure RBAC roles:

| Role | Purpose | Scope |
|---|---|---|
| Monitoring Reader | Read Azure Monitor metrics | Subscription or resource group |
| Reader | Query Azure Resource Graph for resource discovery | Subscription or resource group |

Default Behavior

Auto-Detection

The collector has two discovery phases:

Bootstrap (first run)

  • With the default profiles.mode: auto, the collector queries Azure Resource Graph within the configured subscription_ids to find candidate resources.
  • It matches discovered resource types against built-in profiles and automatically enables the relevant ones.
  • Discovery scope can be narrowed using discovery.mode: filters (resource groups, regions, tags) or replaced entirely with discovery.mode: query for a custom KQL query.
  • A single job can monitor multiple subscriptions.

Runtime (periodic refresh)

  • Periodically re-discovers resources for already-active profile types only.
  • Controlled by discovery.refresh_every (default: 300 seconds, set to 0 to disable).

Important: Runtime refresh does not activate new profiles. If a new resource type appears after bootstrap, restart the collector to pick it up.

Limits

  • Minimum collection interval: 60 seconds (enforced). Azure Monitor metrics granularity is typically 1 minute.
  • Metrics reporting delay: Azure Monitor metrics have a 1-3 minute reporting delay. The collector uses query_offset (default: 180s) as a minimum offset and automatically uses a larger effective offset for slower time-grain batches when needed.
  • API throttling: Azure Monitor applies per-subscription rate limits. The collector uses bounded concurrency and batching to stay within them, but monitoring many resources in a single subscription may require tuning the limits.* options.

Performance Impact

The collector batches resources and metrics to minimize Azure API calls and uses bounded concurrency to avoid overwhelming the API.

Default concurrency and batching limits:

| Setting | Default | Description |
|---|---|---|
| limits.max_concurrency | 4 | Maximum concurrent batch queries |
| limits.max_batch_resources | 50 | Maximum resources per batch request |
| limits.max_metrics_per_query | 20 | Maximum metrics per batch request |

For large deployments, consider splitting resources across multiple jobs. If you hit Azure API rate limits, reduce limits.max_concurrency.
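To get a rough feel for how the batching limits translate into API calls, the sketch below estimates the number of batch requests for one group of resources that share a subscription, region, and time grain. It assumes resources and metrics are chunked independently and that every resource chunk is queried once per metric chunk. That is an illustrative assumption, not confirmed collector behavior, and estimate_batch_requests is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: estimate batch requests for one
# (subscription, region, time grain) group.
# Assumption: resources are chunked by max_batch_resources, metrics by
# max_metrics_per_query, and each resource chunk is queried once per metric chunk.
estimate_batch_requests() {
    resources=$1
    metrics=$2
    max_batch_resources=${3:-50}      # limits.max_batch_resources default
    max_metrics_per_query=${4:-20}    # limits.max_metrics_per_query default

    # Integer ceiling division: (a + b - 1) / b
    resource_chunks=$(( (resources + max_batch_resources - 1) / max_batch_resources ))
    metric_chunks=$(( (metrics + max_metrics_per_query - 1) / max_metrics_per_query ))

    echo $(( resource_chunks * metric_chunks ))
}

# 120 resources with 25 metrics each, using the defaults: prints 6
estimate_batch_requests 120 25
```

Under these assumptions, 120 resources need 3 resource chunks and 25 metrics need 2 metric chunks, so each collection cycle issues about 6 requests for that group.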

Setup

You can configure the azure_monitor collector in two ways:

| Method | Best for | How to |
|---|---|---|
| UI | Fast setup without editing files | Go to Nodes → Configure this node → Collectors → Jobs, search for azure_monitor, then click + to add a job. |
| File | If you prefer configuring via file, or need to automate deployments (e.g., with Ansible) | Edit go.d/azure_monitor.conf and add a job. |
Important: UI configuration requires a paid Netdata Cloud plan.

Prerequisites

Create an Azure monitoring principal

The collector requires a service principal or managed identity with two Azure RBAC roles:

| Role | Purpose |
|---|---|
| Monitoring Reader | Access Azure Monitor metrics for target resources |
| Reader | Query Azure Resource Graph for resource discovery |

Option A: Service principal

# Create service principal with Monitoring Reader role
az ad sp create-for-rbac --name "netdata-monitor" --role "Monitoring Reader" \
--scopes /subscriptions/<subscription-id>

# Add the Reader role for resource discovery
az role assignment create --assignee <appId-from-above> \
--role "Reader" --scope /subscriptions/<subscription-id>

# From the output, note the appId (client_id), password (client_secret), and tenant (tenant_id)

Option B: Managed identity (Azure VMs, VMSS, or AKS)

# Assign both roles to the VM's managed identity
az role assignment create --assignee <managed-identity-principal-id> \
--role "Monitoring Reader" --scope /subscriptions/<subscription-id>

az role assignment create --assignee <managed-identity-principal-id> \
--role "Reader" --scope /subscriptions/<subscription-id>

Configuration

Options

The following options can be defined globally: update_every, autodetection_retry.

Profile file locations:

| Type | Path |
|---|---|
| Stock profiles | /usr/lib/netdata/conf.d/go.d/azure_monitor.profiles/default/ |
| User overrides | /etc/netdata/go.d/azure_monitor.profiles/ |

User profile files with the same id as a stock profile override it. Custom profiles extend the collector's catalog -- they do not replace the discovery mechanism.

Config options
| Group | Option | Description | Default | Required |
|---|---|---|---|---|
| Collection | update_every | Data collection interval (seconds). Must be at least 60. | 60 | no |
| | autodetection_retry | Autodetection retry interval (seconds). Set 0 to disable. | 0 | no |
| | subscription_ids | List of Azure subscription IDs to monitor. Used as the scope for resource discovery. | | yes |
| | cloud | Azure cloud environment: public, government, or china. | public | no |
| | query_offset | Minimum offset (seconds) subtracted from metric query windows. Increase if metrics appear incomplete. | 180 | no |
| | timeout | Timeout for Azure Resource Graph and Azure Monitor API requests, in seconds. | 30 | no |
| Authentication | auth.mode | Authentication method: service_principal, managed_identity, or default. | | yes |
| | auth.mode_service_principal.tenant_id | Entra ID tenant ID (required for service_principal mode). | | no |
| | auth.mode_service_principal.client_id | Entra ID application (client) ID (required for service_principal mode). | | no |
| | auth.mode_service_principal.client_secret | Entra ID client secret (required for service_principal mode). | | no |
| | auth.mode_managed_identity.client_id | Client ID for user-assigned managed identity. Leave empty for system-assigned. | | no |
| Discovery | discovery.refresh_every | Interval (seconds) for refreshing discovered resources. Set 0 to disable runtime re-discovery after bootstrap. | 300 | no |
| | discovery.mode | Resource discovery method: filters (structured filters) or query (custom KQL). | filters | no |
| | discovery.mode_filters.resource_groups | Optional list of Azure resource groups to include in filters mode. | [] | no |
| | discovery.mode_filters.regions | Optional list of Azure regions to include in filters mode. | [] | no |
| | discovery.mode_filters.tags | Optional exact-match tag filters for filters mode. Keys are matched case-insensitively and values case-sensitively. | {} | no |
| | discovery.mode_query.kql | Custom Azure Resource Graph KQL for query mode. Must project id, name, type, resourceGroup, location. | | no |
| Profiles | profiles.mode | How profiles are selected: auto (discover from resources), exact (explicit list), or combined (both). | auto | no |
| | profiles.mode_exact.names | Explicit profile file basenames used by exact mode. Matching is case-insensitive. | [] | no |
| | profiles.mode_combined.names | Explicit profile file basenames merged with auto-discovered profiles in combined mode. Matching is case-insensitive. | [] | no |
| Limits | limits.max_concurrency | Maximum concurrent batch queries to Azure Monitor. | 4 | no |
| | limits.max_batch_resources | Maximum resources per Azure Monitor batch request. | 50 | no |
| | limits.max_metrics_per_query | Maximum metrics per Azure Monitor batch request. | 20 | no |
| Virtual Node | vnode | Associates this data collection job with a Virtual Node. | | no |

query_offset

Azure Monitor metrics have a built-in reporting delay of 1-3 minutes. To avoid fetching incomplete data points, the collector subtracts query_offset from the current time when building metric query windows.

The configured query_offset acts as a minimum floor. For slower metric batches, the collector automatically uses a larger effective offset when the batch time grain is longer than the configured value.

  • Default (180s) works for most services.
  • Longer time grains (for example PT5M) automatically use at least one full time grain as the effective offset.
  • Increase to 240-300s if you still see gaps or missing data points.
  • Do not set below 60s -- metrics will likely be incomplete.
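
The floor behavior described above can be sketched as follows. This is a minimal illustration, assuming the effective offset is simply the larger of query_offset and the batch time grain; the collector's actual calculation may differ, and effective_offset is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper illustrating the documented rule:
# query_offset is a floor, and a batch whose time grain is longer than the
# configured value uses at least one full time grain instead.
# Assumption: the effective offset is the larger of the two values.
effective_offset() {
    query_offset=$1    # configured query_offset, in seconds
    time_grain=$2      # batch time grain, in seconds (PT1M = 60, PT5M = 300)

    if [ "$time_grain" -gt "$query_offset" ]; then
        echo "$time_grain"
    else
        echo "$query_offset"
    fi
}

effective_offset 180 60     # PT1M batch: prints 180
effective_offset 180 300    # PT5M batch: prints 300
```

With the default of 180 seconds, only batches with a time grain longer than three minutes end up with a larger offset in this sketch.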
auth.mode

Determines how the collector authenticates with Azure.

| Mode | When to use | Required options |
|---|---|---|
| service_principal | Running outside Azure, or when you need explicit credentials | tenant_id, client_id, client_secret |
| managed_identity | Running on Azure VMs, VMSS, or AKS with a managed identity | Optionally client_id for user-assigned identity |
| default | Uses the Azure SDK default credential chain (environment variables, managed identity, Azure CLI, etc.) | None |

discovery.mode

Controls how the collector finds candidate Azure resources.

| Mode | Behavior |
|---|---|
| filters | Builds an Azure Resource Graph query from the structured mode_filters.* options (resource groups, regions, tags). This is the default. |
| query | Uses the raw KQL you provide in discovery.mode_query.kql. The query must project id, name, type, resourceGroup, and location. |

discovery.mode_query.kql

A raw Azure Resource Graph KQL query used when discovery.mode is query.

The query must project these five columns:

| Column | Description |
|---|---|
| id | Full Azure resource ID (ARM format) |
| name | Resource name |
| type | Resource type (e.g., microsoft.sql/servers/databases) |
| resourceGroup | Resource group name |
| location | Azure region |

Example:

resources
| where tags.env =~ "prod"
| project id, name, type, resourceGroup, location
profiles.mode

Controls how the collector decides which metric profiles to activate.

| Mode | Behavior |
|---|---|
| auto | Discovers resource types in your subscriptions and enables matching built-in profiles automatically. This is the default. |
| exact | Uses only the profile basenames listed under profiles.mode_exact.names. No auto-discovery. |
| combined | Merges auto-discovered profiles with the basenames listed under profiles.mode_combined.names. |

Profile basename matching is case-insensitive. A basename is the profile filename without the .yaml / .yml suffix.
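
As a rough illustration of the basename rule, the sketch below derives the name you would list under profiles.mode_exact.names or profiles.mode_combined.names from a profile file path. profile_basename is a hypothetical helper, not part of the collector, and it assumes the suffix itself is written in lowercase.

```shell
#!/bin/sh
# Hypothetical helper: derive the comparison key for a profile file.
# Drops the directory and a trailing .yaml / .yml suffix, then lowercases
# the result so that comparisons are case-insensitive.
profile_basename() {
    name=${1##*/}                 # strip any leading directory
    case "$name" in
        *.yaml) name=${name%.yaml} ;;
        *.yml)  name=${name%.yml} ;;
    esac
    printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]'
}

# prints sql_database
profile_basename "/etc/netdata/go.d/azure_monitor.profiles/SQL_Database.yaml"
```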

via UI

Configure the azure_monitor collector from the Netdata web interface:

  1. Go to Nodes.
  2. Select the node where you want the azure_monitor data-collection job to run and click Configure this node. That node will run the data collection.
  3. The Collectors → Jobs view opens by default.
  4. In the Search box, type azure_monitor (or scroll the list) to locate the azure_monitor collector.
  5. Click the + next to the azure_monitor collector to add a new job.
  6. Fill in the job fields, then click Test to verify the configuration and Submit to save.
    • Test runs the job with the provided settings and shows whether data can be collected.
    • If it fails, an error message appears with details (for example, connection refused, timeout, or command execution errors), so you can adjust and retest.

via File

The configuration file name for this integration is go.d/azure_monitor.conf.

The file format is YAML. Generally, the structure is:

update_every: 1
autodetection_retry: 0
jobs:
  - name: some_name1
  - name: some_name2

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/azure_monitor.conf
Examples
Service principal with structured discovery

Authenticate with a service principal and auto-discover resources across two subscriptions, filtered to the production-rg resource group in eastus with the tag env=prod.

jobs:
  - name: prod
    subscription_ids:
      - "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
      - "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"
    discovery:
      mode: filters
      mode_filters:
        resource_groups:
          - production-rg
        regions:
          - eastus
        tags:
          env:
            - prod
    profiles:
      mode: auto
    auth:
      mode: service_principal
      mode_service_principal:
        tenant_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_secret: "your-client-secret"
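
The tag filter in this example follows the matching rule from the options table: keys are compared case-insensitively and values case-sensitively. The sketch below shows that rule for a single filter entry. tag_matches is a hypothetical helper, and how the collector combines several keys or values is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: does a resource tag satisfy one filter entry?
# Keys compare case-insensitively, values compare exactly (case-sensitive),
# as described for discovery.mode_filters.tags.
tag_matches() {
    filter_key=$1
    filter_value=$2
    tag_key=$3
    tag_value=$4

    fk=$(printf '%s' "$filter_key" | tr '[:upper:]' '[:lower:]')
    tk=$(printf '%s' "$tag_key" | tr '[:upper:]' '[:lower:]')

    [ "$fk" = "$tk" ] && [ "$filter_value" = "$tag_value" ]
}

# Filter env=prod checked against two differently cased resource tags
if tag_matches env prod ENV prod; then echo "ENV=prod matches"; fi
if ! tag_matches env prod env Prod; then echo "env=Prod does not match"; fi
```

A resource tagged ENV=prod would pass the filter, while env=Prod would not, because only the key comparison ignores case.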

Managed identity with exact profiles

Use a managed identity (on an Azure VM, VMSS, or AKS) and monitor only SQL Database and PostgreSQL Flexible Server resources -- skip auto-discovery of other services.

jobs:
  - name: databases
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    profiles:
      mode: exact
      mode_exact:
        names:
          - sql_database
          - postgres_flexible
    auth:
      mode: managed_identity

Custom Azure Resource Graph KQL

Replace the built-in discovery filters with your own KQL query. Useful when you need joins, computed columns, or filtering logic that structured filters cannot express.

jobs:
  - name: prod-query
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    discovery:
      mode: query
      mode_query:
        kql: |
          resources
          | where tags.env =~ "prod"
          | project id, name, type, resourceGroup, location
    profiles:
      mode: auto
    auth:
      mode: default

Azure Government cloud

Connect to an Azure Government environment. Set cloud: government to use the correct authentication and API endpoints.

jobs:
  - name: gov
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    cloud: government
    auth:
      mode: service_principal
      mode_service_principal:
        tenant_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_secret: "your-client-secret"

Alerts

The following alerts are available:

| Alert name | On metric | Description |
|---|---|---|
| am_event_hubs_server_errors | azure_monitor.event_hubs.errors | Event Hubs server errors on ${label:resource_name} |
| am_event_hubs_throttled_requests | azure_monitor.event_hubs.errors | Event Hubs throttled requests on ${label:resource_name} |
| am_event_hubs_quota_exceeded | azure_monitor.event_hubs.errors | Event Hubs quota exceeded on ${label:resource_name} |
| am_event_hubs_user_errors | azure_monitor.event_hubs.errors | Event Hubs user errors on ${label:resource_name} |
| am_event_hubs_success_rate | azure_monitor.event_hubs.requests | Event Hubs request success rate on ${label:resource_name} |
| am_event_hubs_namespace_cpu | azure_monitor.event_hubs.namespace_resources | Event Hubs namespace CPU on ${label:resource_name} |
| am_event_hubs_namespace_memory | azure_monitor.event_hubs.namespace_resources | Event Hubs namespace memory on ${label:resource_name} |
| am_event_hubs_capture_backlog | azure_monitor.event_hubs.capture_backlog | Event Hubs capture backlog on ${label:resource_name} |
| am_event_hubs_replication_lag | azure_monitor.event_hubs.replication_lag | Event Hubs replication lag on ${label:resource_name} |
| am_event_hubs_replication_lag_duration | azure_monitor.event_hubs.replication_lag_duration | Event Hubs replication lag duration on ${label:resource_name} |

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per resource

These metrics refer to each monitored Azure resource.

Labels:

| Label | Description |
|---|---|
| resource_name | The Azure resource name. |
| resource_group | The Azure resource group. |
| region | The Azure region where the resource is deployed. |
| resource_type | The Azure resource type identifier. |
| profile | The Azure Monitor profile id. |
| subscription_id | The Azure subscription identifier. |
| resource_uid | The unique Azure resource identifier. |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| azure_monitor.event_hubs.message_flow | in, out | messages/s |
| azure_monitor.event_hubs.data_throughput | in, out | bytes/s |
| azure_monitor.event_hubs.requests | incoming, successful | requests/s |
| azure_monitor.event_hubs.errors | server, user, throttled, quota_exceeded | errors/s |
| azure_monitor.event_hubs.connections | active | connections |
| azure_monitor.event_hubs.connection_events | opened, closed | connections |
| azure_monitor.event_hubs.captured_messages | total | messages/s |
| azure_monitor.event_hubs.captured_data | total | bytes/s |
| azure_monitor.event_hubs.namespace_size | average | bytes |
| azure_monitor.event_hubs.capture_backlog | backlog | messages |
| azure_monitor.event_hubs.namespace_resources | cpu, memory | percentage |
| azure_monitor.event_hubs.replication_lag | messages | messages |
| azure_monitor.event_hubs.replication_lag_duration | duration | seconds |

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the azure_monitor collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/
  • Switch to the netdata user.

    sudo -u netdata -s
  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m azure_monitor

    To debug a specific job:

    ./go.d.plugin -d -m azure_monitor -j jobName

Getting Logs

If you're encountering problems with the azure_monitor collector, follow these steps to retrieve logs and identify potential issues:

  • Run the command specific to your system (systemd, non-systemd, or Docker container).
  • Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep azure_monitor

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector's name:

grep azure_monitor /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:

docker logs netdata 2>&1 | grep azure_monitor

No metrics are collected

Check the following:

  • Permissions -- The principal has both Monitoring Reader and Reader roles on the target subscription.
  • Subscription IDs -- The subscription_ids list includes the correct subscription(s).
  • Resources are active -- Verify in Azure Portal > Metrics that the resources are producing metrics.
  • Collector logs -- Check for authentication or API errors:
    # systemd
    journalctl -u netdata --namespace=netdata --grep azure_monitor --since "5 minutes ago"
    # non-systemd
    grep azure_monitor /var/log/netdata/collector.log
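
For the permissions check, one possible approach is to list the principal's role assignments with the Azure CLI and compare them against the two required roles. The sketch below assumes the Azure CLI is installed and logged in, and that the roles were assigned at subscription scope. list_roles and missing_roles are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: list the role names assigned to a principal at a
# subscription scope. Requires the Azure CLI and a prior `az login`.
list_roles() {
    principal=$1       # appId or managed identity principal ID
    subscription=$2    # subscription ID

    az role assignment list \
        --assignee "$principal" \
        --scope "/subscriptions/$subscription" \
        --query "[].roleDefinitionName" \
        --output tsv
}

# Hypothetical helper: read role names (one per line) from stdin and print
# whichever of the two required roles is absent.
missing_roles() {
    assigned=$(cat)
    for role in "Monitoring Reader" "Reader"; do
        printf '%s\n' "$assigned" | grep -qxF "$role" || echo "$role"
    done
}
```

Running list_roles <principal-id> <subscription-id> | missing_roles prints nothing when both roles are present. If the roles were granted on a resource group instead, adjust the --scope value accordingly.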

Missing metrics for some resource types

Profiles are matched by Azure resource type. If a resource type exists but metrics are missing:

  • Check profile mode -- Ensure profiles.mode: auto (default), or explicitly list the profile basename under profiles.mode_exact.names or profiles.mode_combined.names.
  • Verify a built-in profile exists -- List available profiles:
    ls /usr/lib/netdata/conf.d/go.d/azure_monitor.profiles/default/
  • Check resource activity -- Some metrics only appear when the resource is actively processing data (e.g., IoT Hub telemetry metrics require devices to be sending messages).
  • New resource types after startup -- Runtime discovery does not activate new profiles. Restart the collector if new resource types were added after bootstrap.

Charts have gaps or incomplete data

Azure Monitor metrics have a built-in reporting delay of 1-3 minutes.

  • The collector uses query_offset (default: 180 seconds) as the minimum offset for metric query windows.
  • Slower time-grain batches automatically use a larger effective offset when needed.
  • If metrics are still missing or incomplete, increase query_offset to 240 or 300 seconds.

Authentication errors in sovereign clouds

For Azure Government or Azure China clouds, set the cloud parameter:

  • Azure Government: cloud: government
  • Azure China (21Vianet): cloud: china

Ensure the service principal is registered in the correct cloud tenant.
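
To show why the cloud setting matters, the sketch below maps each value to the well-known Microsoft Entra ID authority host and Azure Resource Manager host for that environment. It is purely illustrative: cloud_endpoints is a hypothetical helper, and the collector resolves its endpoints internally.

```shell
#!/bin/sh
# Illustrative mapping from the `cloud` option to the well-known Microsoft
# Entra ID authority host and Azure Resource Manager host per environment.
# Assumption: shown only to explain the setting; this is not collector code.
cloud_endpoints() {
    case "$1" in
        public)     echo "login.microsoftonline.com management.azure.com" ;;
        government) echo "login.microsoftonline.us management.usgovcloudapi.net" ;;
        china)      echo "login.chinacloudapi.cn management.chinacloudapi.cn" ;;
        *)          echo "unknown cloud: $1" >&2; return 1 ;;
    esac
}

# prints: login.microsoftonline.us management.usgovcloudapi.net
cloud_endpoints government
```

A service principal created in the public cloud cannot obtain a token from a sovereign-cloud authority host, which is why a mismatched cloud value typically surfaces as an authentication error.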

