Azure Service Bus Namespace

Plugin: go.d.plugin
Module: azure_monitor

Overview

:::info

This is part of the Azure Monitor collector. No separate setup is needed – a single Azure Monitor job discovers and monitors all supported resource types automatically.

:::

Monitor Azure Service Bus with metrics covering:

Messages – message flow (in/out), queue depth (active messages), dead-lettered and scheduled messages, total messages
Throughput – data throughput (in/out bytes per second)
Operations – completed and abandoned message operations, send latency
Connections – active connections, connection events (opened/closed)
Requests – incoming and successful request rates
Errors – server errors, user errors, throttled requests
Replication – replication lag (messages and duration)
Resources – namespace size, CPU and memory utilization, pending checkpoint operations

It uses the Azure Monitor Metrics batch API to collect metrics, grouping requests by subscription, region, and time grain. Resources are discovered via Azure Resource Graph queries at startup and refreshed periodically. Authentication is handled through Microsoft Entra ID (service principal, managed identity, or default credentials).

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

The service principal or managed identity requires these Azure RBAC roles:

Role	Purpose	Scope
Monitoring Reader	Read Azure Monitor metrics	Subscription or resource group
Reader	Query Azure Resource Graph for resource discovery	Subscription or resource group

Default Behavior

Auto-Detection

The collector has two discovery phases:

Bootstrap (first run)

With the default profiles.mode: auto, the collector queries Azure Resource Graph within the configured subscription_ids to find candidate resources.
It matches discovered resource types against built-in profiles and automatically enables the relevant ones.
Discovery scope can be narrowed using discovery.mode: filters (resource groups, regions, tags) or replaced entirely with discovery.mode: query for a custom KQL query.
A single job can monitor multiple subscriptions.

Runtime (periodic refresh)

The collector periodically re-discovers resources, but only for resource types whose profiles are already active.
The interval is controlled by discovery.refresh_every (default: 300 seconds; set to 0 to disable).

Important: Runtime refresh does not activate new profiles. If a new resource type appears after bootstrap, restart the collector to pick it up.
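
As a rough sketch of how the refresh interval might be tuned, the job below lengthens runtime re-discovery to 10 minutes. The job name, the subscription ID, and the value 600 are placeholders chosen for illustration; only the option names come from the options documented for this collector.

```yaml
jobs:
  - name: slow_refresh                             # hypothetical job name
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"     # placeholder subscription ID
    discovery:
      refresh_every: 600                           # seconds; default is 300, 0 disables runtime re-discovery
    auth:
      mode: default
```

With refresh_every set to 0, the resource list found at bootstrap would stay fixed until the collector is restarted.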

Limits

Minimum collection interval: 60 seconds (enforced). Azure Monitor metrics granularity is typically 1 minute.
Metrics reporting delay: Azure Monitor metrics have a 1-3 minute reporting delay. The collector uses query_offset (default: 180s) as a minimum offset and automatically uses a larger effective offset for slower time-grain batches when needed.
API throttling: Azure Monitor applies per-subscription rate limits. The collector uses bounded concurrency and batching to stay within them, but monitoring many resources in a single subscription may require tuning the limits.* options.

Performance Impact

The collector batches resources and metrics to minimize Azure API calls and uses bounded concurrency to avoid overwhelming the API.

Default concurrency and batching limits:

Setting	Default	Description
`limits.max_concurrency`	4	Maximum concurrent batch queries
`limits.max_batch_resources`	50	Maximum resources per batch request
`limits.max_metrics_per_query`	20	Maximum metrics per batch request

For large deployments, consider splitting resources across multiple jobs. If you hit Azure API rate limits, reduce limits.max_concurrency.
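
One possible way to apply both suggestions is sketched below: two jobs, one per subscription, each with a lower concurrency limit. The job names, subscription IDs, and the value 2 are illustrative assumptions, not recommended settings.

```yaml
jobs:
  # One job per subscription; both job names are hypothetical.
  - name: sub_a
    subscription_ids:
      - "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"     # placeholder subscription ID
    limits:
      max_concurrency: 2                           # lowered from the default of 4
    auth:
      mode: default
  - name: sub_b
    subscription_ids:
      - "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"     # placeholder subscription ID
    limits:
      max_concurrency: 2                           # lowered from the default of 4
    auth:
      mode: default
```

The options limits.max_batch_resources and limits.max_metrics_per_query are left out here, so they keep their defaults of 50 and 20.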

Setup

You can configure the azure_monitor collector in two ways:

Method	Best for	How to
UI	Fast setup without editing files	Go to Nodes → Configure this node → Collectors → Jobs, search for azure_monitor, then click + to add a job.
File	If you prefer configuring via file, or need to automate deployments (e.g., with Ansible)	Edit `go.d/azure_monitor.conf` and add a job.

:::important

UI configuration requires a paid Netdata Cloud plan.

:::

Prerequisites

Create an Azure monitoring principal

The collector requires a service principal or managed identity with two Azure RBAC roles:

Role	Purpose
Monitoring Reader	Access Azure Monitor metrics for target resources
Reader	Query Azure Resource Graph for resource discovery

Option A: Service principal

# Create service principal with Monitoring Reader role
az ad sp create-for-rbac --name "netdata-monitor" --role "Monitoring Reader" \
  --scopes /subscriptions/<subscription-id>

# Add the Reader role for resource discovery
az role assignment create --assignee <appId-from-above> \
  --role "Reader" --scope /subscriptions/<subscription-id>

# Note the appId (client_id), password (client_secret), and tenant (tenant_id) from the output

Option B: Managed identity (Azure VMs, VMSS, or AKS)

# Assign both roles to the managed identity
az role assignment create --assignee <managed-identity-principal-id> \
  --role "Monitoring Reader" --scope /subscriptions/<subscription-id>

az role assignment create --assignee <managed-identity-principal-id> \
  --role "Reader" --scope /subscriptions/<subscription-id>

Configuration

Options

The following options can be defined globally: update_every, autodetection_retry.

Profile file locations:

Type	Path
Stock profiles	`/usr/lib/netdata/conf.d/go.d/azure_monitor.profiles/default/`
User overrides	`/etc/netdata/go.d/azure_monitor.profiles/`

User profile files with the same id as a stock profile override it. Custom profiles extend the collector’s catalog – they do not replace the discovery mechanism.

Group	Option	Description	Default	Required
Collection	update_every	Data collection interval (seconds). Must be at least 60.	60	no
	autodetection_retry	Autodetection retry interval (seconds). Set 0 to disable.	0	no
	subscription_ids	List of Azure subscription IDs to monitor. Used as the scope for resource discovery.		yes
	cloud	Azure cloud environment: `public`, `government`, or `china`.	public	no
	query_offset	Minimum offset (seconds) subtracted from metric query windows. Increase if metrics appear incomplete.	180	no
	timeout	Timeout for Azure Resource Graph and Azure Monitor API requests, in seconds.	30	no
Authentication	auth.mode	Authentication method: `service_principal`, `managed_identity`, or `default`.		yes
	auth.mode_service_principal.tenant_id	Entra ID tenant ID (required for `service_principal` mode).		no
	auth.mode_service_principal.client_id	Entra ID application (client) ID (required for `service_principal` mode).		no
	auth.mode_service_principal.client_secret	Entra ID client secret (required for `service_principal` mode).		no
	auth.mode_managed_identity.client_id	Client ID for user-assigned managed identity. Leave empty for system-assigned.		no
Discovery	discovery.refresh_every	Interval (seconds) for refreshing discovered resources. Set `0` to disable runtime re-discovery after bootstrap.	300	no
	discovery.mode	Resource discovery method: `filters` (structured filters) or `query` (custom KQL).	filters	no
	discovery.mode_filters.resource_groups	Optional list of Azure resource groups to include in `filters` mode.	[]	no
	discovery.mode_filters.regions	Optional list of Azure regions to include in `filters` mode.	[]	no
	discovery.mode_filters.tags	Optional exact-match tag filters for `filters` mode. Keys are matched case-insensitively and values case-sensitively.	{}	no
	discovery.mode_query.kql	Custom Azure Resource Graph KQL for `query` mode. Must project `id`, `name`, `type`, `resourceGroup`, `location`.		no
Profiles	profiles.mode	How profiles are selected: `auto` (discover from resources), `exact` (explicit list), or `combined` (both).	auto	no
	profiles.mode_exact.names	Explicit profile file basenames used by `exact` mode. Matching is case-insensitive.	[]	no
	profiles.mode_combined.names	Explicit profile file basenames merged with auto-discovered profiles in `combined` mode. Matching is case-insensitive.	[]	no
Limits	limits.max_concurrency	Maximum concurrent batch queries to Azure Monitor.	4	no
	limits.max_batch_resources	Maximum resources per Azure Monitor batch request.	50	no
	limits.max_metrics_per_query	Maximum metrics per Azure Monitor batch request.	20	no
Virtual Node	vnode	Associates this data collection job with a Virtual Node.		no

query_offset

Azure Monitor metrics have a built-in reporting delay of 1-3 minutes. To avoid fetching incomplete data points, the collector subtracts an offset from the current time when building metric query windows.

The configured query_offset is the floor for that offset. When a batch's time grain is longer than the configured value, the collector automatically uses a larger effective offset.

Default (180s) works for most services.
Longer time grains (for example PT5M) automatically use at least one full time grain as the effective offset.
Increase to 240-300s if you still see gaps or missing data points.
Do not set below 60s – metrics will likely be incomplete.
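
The guidance above could translate into a job like the following minimal sketch, which raises the offset to 240 seconds. The job name and subscription ID are placeholders, and 240 is just one value from the suggested 240-300s range.

```yaml
jobs:
  - name: offset_tuned                             # hypothetical job name
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"     # placeholder subscription ID
    update_every: 60                               # minimum collection interval
    query_offset: 240                              # seconds; default is 180
    auth:
      mode: default
```

Because query_offset is only a floor, batches with a longer time grain (for example PT5M) would still use a larger effective offset than the value set here.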

auth.mode

Determines how the collector authenticates with Azure.

Mode	When to use	Required options
`service_principal`	Running outside Azure, or when you need explicit credentials	`tenant_id`, `client_id`, `client_secret`
`managed_identity`	Running on Azure VMs, VMSS, or AKS with a managed identity	Optionally `client_id` for user-assigned identity
`default`	Uses the Azure SDK default credential chain (environment variables, managed identity, Azure CLI, etc.)	None
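
For the user-assigned case mentioned in the table, a configuration might look like the sketch below. The job name and both IDs are placeholders; the nesting follows the auth.mode_managed_identity.client_id option name.

```yaml
jobs:
  - name: user_assigned_identity                   # hypothetical job name
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"     # placeholder subscription ID
    auth:
      mode: managed_identity
      mode_managed_identity:
        # Client ID of the user-assigned identity (placeholder).
        # Leave the whole mode_managed_identity block out for a system-assigned identity.
        client_id: "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
```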

discovery.mode

Controls how the collector finds candidate Azure resources.

Mode	Behavior
`filters`	Builds an Azure Resource Graph query from the structured `mode_filters.*` options (resource groups, regions, tags). This is the default.
`query`	Uses the raw KQL you provide in `discovery.mode_query.kql`. The query must project `id`, `name`, `type`, `resourceGroup`, and `location`.

discovery.mode_query.kql

A raw Azure Resource Graph KQL query used when discovery.mode is query.

The query must project these five columns:

Column	Description
`id`	Full Azure resource ID (ARM format)
`name`	Resource name
`type`	Resource type (e.g., `microsoft.sql/servers/databases`)
`resourceGroup`	Resource group name
`location`	Azure region

Example:

resources
| where tags.env =~ "prod"
| project id, name, type, resourceGroup, location

profiles.mode

Controls how the collector decides which metric profiles to activate.

Mode	Behavior
`auto`	Discovers resource types in your subscriptions and enables matching built-in profiles automatically. This is the default.
`exact`	Uses only the profile basenames listed under `profiles.mode_exact.names`. No auto-discovery.
`combined`	Merges auto-discovered profiles with the basenames listed under `profiles.mode_combined.names`.

Profile basename matching is case-insensitive. A basename is the profile filename without the .yaml / .yml suffix.
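
As an illustration of combined mode, the sketch below keeps auto-discovery and additionally lists two profile basenames. The names sql_database and postgres_flexible are borrowed from the exact-mode example elsewhere on this page; the job name and subscription ID are placeholders.

```yaml
jobs:
  - name: combined_profiles                        # hypothetical job name
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"     # placeholder subscription ID
    profiles:
      mode: combined                               # auto-discovered profiles plus the names below
      mode_combined:
        names:
          - sql_database                           # profile basenames, without .yaml / .yml
          - postgres_flexible
    auth:
      mode: default
```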

via UI

Configure the azure_monitor collector from the Netdata web interface:

Go to Nodes.
Select the node where you want the azure_monitor data-collection job to run and click the :gear: (Configure this node). That node will run the data collection.
The Collectors → Jobs view opens by default.
In the Search box, type azure_monitor (or scroll the list) to locate the azure_monitor collector.
Click the + next to the azure_monitor collector to add a new job.
Fill in the job fields, then click Test to verify the configuration and Submit to save.
- Test runs the job with the provided settings and shows whether data can be collected.
- If it fails, an error message appears with details (for example, connection refused, timeout, or command execution errors), so you can adjust and retest.

via File

The configuration file name for this integration is go.d/azure_monitor.conf.

The file format is YAML. Generally, the structure is:

update_every: 60
autodetection_retry: 0
jobs:
  - name: some_name1
  - name: some_name2
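
For this collector, a fuller version of that structure might look like the sketch below. It assumes that a job-level update_every takes precedence over the global value; the job names and subscription IDs are placeholders.

```yaml
update_every: 60                                   # global default; this collector enforces a minimum of 60
autodetection_retry: 0                             # global default; 0 disables retries
jobs:
  - name: regular                                  # hypothetical job; uses the global update_every
    subscription_ids:
      - "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"     # placeholder subscription ID
    auth:
      mode: default
  - name: relaxed                                  # hypothetical job with its own interval
    update_every: 120                              # assumed to override the global value for this job only
    subscription_ids:
      - "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"     # placeholder subscription ID
    auth:
      mode: default
```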

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/azure_monitor.conf

Examples

Service principal with structured discovery

Authenticate with a service principal and auto-discover resources across two subscriptions, filtered to the production-rg resource group in eastus with the tag env=prod.

jobs:
  - name: prod
    subscription_ids:
      - "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
      - "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"
    discovery:
      mode: filters
      mode_filters:
        resource_groups:
          - production-rg
        regions:
          - eastus
        tags:
          env:
            - prod
    profiles:
      mode: auto
    auth:
      mode: service_principal
      mode_service_principal:
        tenant_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_secret: "your-client-secret"

Managed identity with exact profiles

Use a managed identity (on an Azure VM, VMSS, or AKS) and monitor only SQL Database and PostgreSQL Flexible Server resources – skip auto-discovery of other services.

jobs:
  - name: databases
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    profiles:
      mode: exact
      mode_exact:
        names:
          - sql_database
          - postgres_flexible
    auth:
      mode: managed_identity

Custom Azure Resource Graph KQL

Replace the built-in discovery filters with your own KQL query. Useful when you need joins, computed columns, or filtering logic that structured filters cannot express.

jobs:
  - name: prod-query
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    discovery:
      mode: query
      mode_query:
        kql: |
          resources
          | where tags.env =~ "prod"
          | project id, name, type, resourceGroup, location
    profiles:
      mode: auto
    auth:
      mode: default
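
The same approach could be used to limit discovery to Service Bus namespaces. The sketch below assumes the Azure resource type string microsoft.servicebus/namespaces and uses a placeholder job name and subscription ID; with profiles.mode: auto, only profiles matching the discovered resource types would be enabled.

```yaml
jobs:
  - name: service_bus_only                         # hypothetical job name
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"     # placeholder subscription ID
    discovery:
      mode: query
      mode_query:
        # =~ is a case-insensitive comparison in KQL.
        kql: |
          resources
          | where type =~ "microsoft.servicebus/namespaces"
          | project id, name, type, resourceGroup, location
    profiles:
      mode: auto
    auth:
      mode: default
```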

Azure Government cloud

Connect to an Azure Government environment. Set cloud: government to use the correct authentication and API endpoints.

jobs:
  - name: gov
    subscription_ids:
      - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    cloud: government
    auth:
      mode: service_principal
      mode_service_principal:
        tenant_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        client_secret: "your-client-secret"

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per resource

These metrics refer to each monitored Azure resource.

Labels:

Label	Description
resource_name	The Azure resource name.
resource_group	The Azure resource group.
region	The Azure region where the resource is deployed.
resource_type	The Azure resource type identifier.
profile	The Azure Monitor profile id.
subscription_id	The Azure subscription identifier.
resource_uid	The unique Azure resource identifier.

Metrics:

Metric	Dimensions	Unit
azure_monitor.service_bus.message_flow	in, out	messages/s
azure_monitor.service_bus.message_operations	completed, abandoned	messages/s
azure_monitor.service_bus.queue_depth	active	messages
azure_monitor.service_bus.problem_messages	dead_lettered, scheduled	messages
azure_monitor.service_bus.requests	incoming, successful	requests/s
azure_monitor.service_bus.errors	server, user, throttled	errors/s
azure_monitor.service_bus.connections	active	connections
azure_monitor.service_bus.connection_events	opened, closed	connections
azure_monitor.service_bus.namespace_size	average	bytes
azure_monitor.service_bus.data_throughput	in, out	bytes/s
azure_monitor.service_bus.total_messages	total	messages
azure_monitor.service_bus.namespace_resources	cpu, memory	percentage
azure_monitor.service_bus.send_latency	average	milliseconds
azure_monitor.service_bus.replication_lag	messages	messages
azure_monitor.service_bus.replication_lag_duration	duration	seconds
azure_monitor.service_bus.checkpoint_operations	pending	operations

Alerts

The following alerts are available:

Alert name	On metric	Description
am_service_bus_server_errors	azure_monitor.service_bus.errors	Service Bus server errors on ${label:resource_name}
am_service_bus_throttled_requests	azure_monitor.service_bus.errors	Service Bus throttled requests on ${label:resource_name}
am_service_bus_user_errors	azure_monitor.service_bus.errors	Service Bus user errors on ${label:resource_name}
am_service_bus_namespace_cpu	azure_monitor.service_bus.namespace_resources	Service Bus namespace CPU on ${label:resource_name}
am_service_bus_namespace_memory	azure_monitor.service_bus.namespace_resources	Service Bus namespace memory on ${label:resource_name}
am_service_bus_send_latency	azure_monitor.service_bus.send_latency	Service Bus send latency on ${label:resource_name}
am_service_bus_dead_lettered_messages	azure_monitor.service_bus.problem_messages	Service Bus dead-lettered messages on ${label:resource_name}
am_service_bus_active_messages	azure_monitor.service_bus.queue_depth	Service Bus queue depth on ${label:resource_name}
am_service_bus_request_success_rate	azure_monitor.service_bus.requests	Service Bus request success rate on ${label:resource_name}
am_service_bus_abandoned_messages	azure_monitor.service_bus.message_operations	Service Bus abandoned messages on ${label:resource_name}
am_service_bus_replication_lag	azure_monitor.service_bus.replication_lag	Service Bus replication lag on ${label:resource_name}
am_service_bus_replication_lag_duration	azure_monitor.service_bus.replication_lag_duration	Service Bus replication lag duration on ${label:resource_name}

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the azure_monitor collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].
```
cd /usr/libexec/netdata/plugins.d/
```
Switch to the netdata user.
```
sudo -u netdata -s
```

Run the go.d.plugin to debug the collector:

./go.d.plugin -d -m azure_monitor

To debug a specific job:

./go.d.plugin -d -m azure_monitor -j jobName

Getting Logs

If you’re encountering problems with the azure_monitor collector, follow these steps to retrieve logs and identify potential issues:

Run the command specific to your system (systemd, non-systemd, or Docker container).
Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep azure_monitor

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector’s name:

grep azure_monitor /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named “netdata” (replace if different), use this command:

docker logs netdata 2>&1 | grep azure_monitor

No metrics are collected

Check the following:

Permissions – The principal has both Monitoring Reader and Reader roles on the target subscription.
Subscription IDs – The subscription_ids list includes the correct subscription(s).
Resources are active – Verify in Azure Portal > Metrics that the resources are producing metrics.

Collector logs – Check for authentication or API errors:

# systemd
journalctl -u netdata --namespace=netdata --grep azure_monitor --since "5 minutes ago"
# non-systemd
grep azure_monitor /var/log/netdata/collector.log

Missing metrics for some resource types

Profiles are matched by Azure resource type. If a resource type exists but metrics are missing:

Check profile mode – Ensure profiles.mode: auto (default), or explicitly list the profile basename under profiles.mode_exact.names or profiles.mode_combined.names.

Verify a built-in profile exists – List available profiles:

ls /usr/lib/netdata/conf.d/go.d/azure_monitor.profiles/default/

Check resource activity – Some metrics only appear when the resource is actively processing data (e.g., IoT Hub telemetry metrics require devices to be sending messages).
New resource types after startup – Runtime discovery does not activate new profiles. Restart the collector if new resource types were added after bootstrap.

Charts have gaps or incomplete data

Azure Monitor metrics have a built-in reporting delay of 1-3 minutes.

The collector uses query_offset (default: 180 seconds) as the minimum offset for metric query windows.
Slower time-grain batches automatically use a larger effective offset when needed.
If metrics are still missing or incomplete, increase query_offset to 240 or 300 seconds.

Authentication errors in sovereign clouds

For Azure Government or Azure China clouds, set the cloud parameter:

Azure Government: cloud: government
Azure China (21Vianet): cloud: china

Ensure the service principal is registered in the correct cloud tenant.