Plugin: go.d.plugin Module: mssql
This collector monitors the health and performance of Microsoft SQL Server instances.
It collects metrics from:
It connects to the SQL Server instance via TCP using the go-mssqldb driver and executes queries against:
sys.dm_os_performance_counters - Performance counter valuessys.dm_exec_sessions - Connection informationsys.dm_os_wait_stats - Wait statisticssys.dm_tran_locks - Lock informationsys.dm_io_virtual_file_stats - I/O stall (latency) statisticssys.dm_os_process_memory - SQL Server process memorysys.dm_os_sys_memory - OS physical memory and page filesys.master_files - Database file sizesmsdb.dbo.sysjobs - SQL Agent job statussys.dm_hadr_availability_group_states - AG health rollupsys.dm_hadr_availability_replica_states - Replica operational statesys.dm_hadr_database_replica_states - Database sync queues and ratessys.dm_hadr_cluster / sys.dm_hadr_cluster_members - WSFC cluster healthsys.dm_hadr_database_replica_cluster_states - Failover readinesssys.dm_hadr_auto_page_repair - Automatic page repair eventssys.dm_hadr_ag_threads - AG thread usage (SQL Server 2019+)This collector is supported on all platforms.
This collector supports collecting metrics from multiple instances of this integration, including remote instances.
The monitoring user requires the VIEW SERVER STATE permission to access DMVs.
SQL Agent job monitoring is part of collector startup, so access to
msdb.dbo.sysjobs is required.
Always On AG monitoring requires VIEW ANY DEFINITION for access to availability group catalog views.
On SQL Server 2022+, HADR DMVs may additionally require VIEW SERVER PERFORMANCE STATE.
Microsoft SQL Server can be monitored further using the following other integrations:
By default, it tries to connect to SQL Server on localhost:1433 without authentication. You must configure proper credentials for monitoring.
The default configuration for this integration does not impose any limits on data collection.
The collector executes lightweight queries against system views. Most queries complete in milliseconds and have minimal impact on server performance.
You can configure the mssql collector in two ways:
| Method | Best for | How to |
|---|---|---|
| UI | Fast setup without editing files | Go to Nodes → Configure this node → Collectors → Jobs, search for mssql, then click + to add a job. |
| File | If you prefer configuring via file, or need to automate deployments (e.g., with Ansible) | Edit go.d/mssql.conf and add a job. |
:::important
UI configuration requires paid Netdata Cloud plan.
:::
Create a SQL Server login with VIEW SERVER STATE permission:
-- Create login
CREATE LOGIN netdata_user WITH PASSWORD = 'YourStrongPassword!';
-- Grant VIEW SERVER STATE (required for DMVs)
GRANT VIEW SERVER STATE TO netdata_user;
-- Grant VIEW ANY DEFINITION (required for Always On AG monitoring)
GRANT VIEW ANY DEFINITION TO netdata_user;
-- Grant VIEW SERVER PERFORMANCE STATE (required for HADR DMVs on SQL Server 2022+)
-- GRANT VIEW SERVER PERFORMANCE STATE TO netdata_user;
-- Grant access to msdb for SQL Agent job monitoring (required)
USE msdb;
CREATE USER netdata_user FOR LOGIN netdata_user;
GRANT SELECT ON dbo.sysjobs TO netdata_user;
-- Optional: Grant access to distribution database for replication monitoring
-- (only if replication is configured)
USE distribution;
CREATE USER netdata_user FOR LOGIN netdata_user;
GRANT SELECT ON dbo.MSreplication_monitordata TO netdata_user;
GRANT SELECT ON dbo.MSpublications TO netdata_user;
GRANT SELECT ON dbo.MSsubscriptions TO netdata_user;
Required permissions:
VIEW SERVER STATE - Access to dynamic management viewsSELECT on msdb.dbo.sysjobs - SQL Agent job status monitoringOptional permissions:
VIEW ANY DEFINITION - Always On Availability Group monitoringVIEW SERVER PERFORMANCE STATE - HADR DMVs on SQL Server 2022+SELECT on distribution.dbo.MSreplication_monitordata - Replication monitoringSELECT on distribution.dbo.MSpublications - Publication informationSELECT on distribution.dbo.MSsubscriptions - Subscription countsThe following options can be defined globally: update_every, autodetection_retry.
| Group | Option | Description | Default | Required |
|---|---|---|---|---|
| Collection | update_every | Data collection interval (seconds). | 10 | no |
| autodetection_retry | Autodetection retry interval (seconds). Set 0 to disable. | 0 | no | |
| Target | dsn | SQL Server DSN (Data Source Name). See DSN syntax. When cloud_auth.provider is azure_ad, use URL format with sqlserver:// scheme. | sqlserver://localhost:1433 | yes |
| Cloud Auth | cloud_auth.provider | Cloud auth provider (none or azure_ad). | none | no |
| Cloud Auth/Azure | cloud_auth.azure_ad.mode | Azure AD credential mode (service_principal, managed_identity, or default). | default | no |
| cloud_auth.azure_ad.tenant_id | Azure tenant ID. Required for service_principal mode. | no | ||
| cloud_auth.azure_ad.client_id | Azure client ID. Required for service_principal; optional for user-assigned managed identity. | no | ||
| cloud_auth.azure_ad.client_secret | Azure client secret for service_principal mode. | no | ||
| cloud_auth.azure_ad.managed_identity_client_id | Optional client ID of a user-assigned managed identity (managed_identity mode). | no | ||
| Target | timeout | Query timeout (seconds). | 5 | no |
| Functions | functions.top_queries.disabled | Disable the top-queries function. | no | no |
| functions.top_queries.timeout | Query timeout for top-queries function (seconds). Uses collector timeout if not set. | no | ||
| functions.top_queries.limit | Maximum number of queries to return in the top-queries response. | 500 | no | |
| functions.top_queries.time_window_days | Number of days of Query Store data to analyze. Set to 0 to include all available data. Smaller values improve query performance but show less history. | 7 | no | |
| functions.deadlock_info.disabled | Disable the deadlock-info function. | no | no | |
| functions.deadlock_info.timeout | Query timeout for deadlock-info function (seconds). Uses collector timeout if not set. | no | ||
| functions.deadlock_info.use_ring_buffer | Use ring_buffer instead of event_file for system_health session. WARNING: Not recommended for production: • Data cleared on failover/restart • 4 MB capacity limit • High CPU load during queries Use only for Azure SQL Database without Blob Storage or testing. | no | no | |
| functions.error_info.disabled | Disable the error-info function. | no | no | |
| functions.error_info.timeout | Query timeout for error-info function (seconds). Uses collector timeout if not set. | no | ||
| functions.error_info.session_name | Extended Events session name capturing error_reported events. Must be created by administrator with event_file (recommended) or ring_buffer target. | netdata_errors | no | |
| functions.error_info.use_ring_buffer | Use ring_buffer instead of event_file for error events. WARNING: Not recommended for production: • Data cleared on failover/restart • 4 MB capacity limit • High CPU load during queries Use only for Azure SQL Database without Blob Storage or testing. | no | no | |
| Virtual Node | vnode | Associates this data collection job with a Virtual Node. | no |
Configure the mssql collector from the Netdata web interface:
The configuration file name for this integration is go.d/mssql.conf.
The file format is YAML. Generally, the structure is:
update_every: 1
autodetection_retry: 0
jobs:
- name: some_name1
- name: some_name2
You can edit the configuration file using the edit-config script from the
Netdata config directory.
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/mssql.conf
Connect to local SQL Server with SQL authentication.
jobs:
- name: local
dsn: "sqlserver://netdata_user:password@localhost:1433"
Connect using Windows integrated authentication.
jobs:
- name: local
dsn: "sqlserver://localhost:1433?trusted_connection=yes"
Connect to a named SQL Server instance.
jobs:
- name: named_instance
dsn: "sqlserver://netdata_user:password@localhost/INSTANCENAME"
Connect to a remote SQL Server.
jobs:
- name: remote
dsn: "sqlserver://netdata_user:password@192.168.1.100:1433"
Use Microsoft Entra service principal authentication for Azure SQL.
jobs:
- name: azure_sql_sp
dsn: "sqlserver://my-server.database.windows.net:1433?database=mydb"
cloud_auth:
provider: azure_ad
azure_ad:
mode: service_principal
tenant_id: "00000000-0000-0000-0000-000000000000"
client_id: "11111111-1111-1111-1111-111111111111"
client_secret: "super-secret-value"
Use managed identity authentication (system-assigned by default).
jobs:
- name: azure_sql_mi
dsn: "sqlserver://my-server.database.windows.net:1433?database=mydb"
cloud_auth:
provider: azure_ad
azure_ad:
mode: managed_identity
Note: When you define multiple jobs, their names must be unique.
Monitoring multiple SQL Server instances.
jobs:
- name: production
dsn: "sqlserver://netdata_user:password@prod-sql:1433"
- name: development
dsn: "sqlserver://netdata_user:password@dev-sql:1433"
Configure function-specific settings like timeouts and limits.
Warning: Query Store may contain unmasked literal values (PII). Disable functions if not needed or ensure proper access controls.
jobs:
- name: local
dsn: "sqlserver://netdata_user:password@localhost:1433"
functions:
top_queries:
limit: 100
time_window_days: 7
deadlock_info:
use_ring_buffer: true
error_info:
session_name: custom_errors
Metrics grouped by scope.
The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.
These metrics refer to the entire SQL Server instance.
This scope has no labels.
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.user_connections | user | connections | • | • |
| mssql.session_connections | user, internal | connections | • | • |
| mssql.blocked_processes | blocked | processes | • | • |
| mssql.batch_requests | batch | requests/s | • | • |
| mssql.compilations | compilations | compilations/s | • | • |
| mssql.recompilations | recompilations | recompilations/s | • | • |
| mssql.auto_param_attempts | total, safe, failed | attempts/s | • | • |
| mssql.sql_errors | errors | errors/s | • | • |
| mssql.buffer_cache_hit_ratio | hit_ratio | percentage | • | • |
| mssql.buffer_page_life_expectancy | life_expectancy | seconds | • | • |
| mssql.buffer_page_iops | read, written | pages/s | • | • |
| mssql.buffer_checkpoint_pages | flushed | pages/s | • | • |
| mssql.buffer_page_lookups | lookups | lookups/s | • | • |
| mssql.buffer_lazy_writes | lazy_writes | writes/s | • | • |
| mssql.memory_total | memory | bytes | • | • |
| mssql.memory_connection | memory | bytes | • | • |
| mssql.memory_pending_grants | pending | processes | • | • |
| mssql.memory_external_benefit | benefit | benefit | • | • |
| mssql.page_splits | page | splits/s | • | • |
| mssql.process_memory_resident | resident | bytes | • | • |
| mssql.process_memory_virtual | virtual | bytes | • | • |
| mssql.process_memory_utilization | utilization | percentage | • | • |
| mssql.process_page_faults | page_faults | faults | • | • |
| mssql.os_memory | used, available | bytes | • | • |
| mssql.os_pagefile | used, available | bytes | • | • |
These metrics refer to individual databases.
Labels:
| Label | Description |
|---|---|
| database | Database name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.database_active_transactions | active | transactions | • | • |
| mssql.database_transactions | transactions | transactions/s | • | • |
| mssql.database_write_transactions | write | transactions/s | • | • |
| mssql.database_log_flushes | flushes | flushes/s | • | • |
| mssql.database_log_flushed | flushed | bytes/s | • | • |
| mssql.database_log_growths | growths | growths | • | • |
| mssql.database_io_stall | read, write | ms | • | • |
| mssql.database_data_file_size | size | bytes | • | • |
| mssql.database_backup_restore_throughput | throughput | bytes/s | • | • |
| mssql.database_state | online, restoring, recovering, pending, suspect, emergency, offline | state | • | • |
| mssql.database_read_only | read_only, read_write | status | • | • |
These metrics refer to lock statistics by lock resource type (from performance counters).
Labels:
| Label | Description |
|---|---|
| resource | Lock resource type (Database, File, Object, Page, Key, Extent, RID, HoBT, etc.) |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.lock_stats_deadlocks | deadlocks | deadlocks/s | • | • |
| mssql.lock_stats_waits | waits | waits/s | • | • |
| mssql.lock_stats_timeouts | timeouts | timeouts/s | • | • |
| mssql.lock_stats_requests | requests | requests/s | • | • |
These metrics refer to lock resource types (from sys.dm_tran_locks).
Labels:
| Label | Description |
|---|---|
| resource | Lock resource type (Database, File, Object, Page, Key, etc.) |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.locks_by_resource | locks | locks | • | • |
These metrics refer to individual wait types (from sys.dm_os_wait_stats).
Labels:
| Label | Description |
|---|---|
| wait_type | Wait type name |
| wait_category | Wait category (CPU, Lock, Latch, Buffer IO, etc.) |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.wait_total_time | duration | ms | • | • |
| mssql.wait_resource_time | duration | ms | • | • |
| mssql.wait_signal_time | duration | ms | • | • |
| mssql.wait_max_time | max_time | ms | • | • |
| mssql.wait_count | waits | waits/s | • | • |
These metrics refer to SQL Server Agent jobs.
Labels:
| Label | Description |
|---|---|
| job_name | Job name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.job_status | enabled, disabled | status | • | • |
These metrics refer to SQL Server replication publications.
Labels:
| Label | Description |
|---|---|
| publisher_db | Publisher database name |
| publication | Publication name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.replication_status | started, succeeded, in_progress, idle, retrying, failed | status | • | • |
| mssql.replication_warning | expiration, latency, merge_expiration, merge_slow_duration, merge_fast_duration, merge_fast_speed, merge_slow_speed | flags | • | • |
| mssql.replication_latency | average, best, worst | seconds | • | • |
| mssql.replication_subscriptions | total, agents_running | subscriptions | • | • |
These metrics refer to Always On Availability Groups. Auto-detected when HADR is enabled.
Labels:
| Label | Description |
|---|---|
| ag_name | Availability group name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.ag_sync_health | not_healthy, partially_healthy, healthy | state | • | • |
| mssql.ag_recovery_health | primary_online, primary_in_progress, secondary_online, secondary_in_progress | state | • | • |
| mssql.ag_threads | capture, redo, parallel_redo | threads | • | • |
These metrics refer to per-replica state within an Availability Group. Note: on secondary replicas, the replica states DMV returns only local information.
Labels:
| Label | Description |
|---|---|
| ag_name | Availability group name |
| replica_server | Replica server name |
| availability_mode | Availability mode (synchronous_commit or asynchronous_commit) |
| failover_mode | Failover mode (automatic or manual) |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.ag_replica_role | primary, secondary, resolving, unknown | state | • | • |
| mssql.ag_replica_connected_state | connected, disconnected, unknown | state | • | • |
| mssql.ag_replica_sync_health | not_healthy, partially_healthy, healthy | state | • | • |
These metrics refer to per-database synchronization within an Availability Group.
Labels:
| Label | Description |
|---|---|
| ag_name | Availability group name |
| replica_server | Replica server name |
| database | Database name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.ag_db_sync_state | not_synchronizing, synchronizing, synchronized, reverting, initializing | state | • | • |
| mssql.ag_db_log_send_queue | queue_size | bytes | • | • |
| mssql.ag_db_log_send_rate | send_rate | bytes/s | • | • |
| mssql.ag_db_redo_queue | queue_size | bytes | • | • |
| mssql.ag_db_redo_rate | redo_rate | bytes/s | • | • |
| mssql.ag_db_filestream_send_rate | send_rate | bytes/s | • | • |
| mssql.ag_db_secondary_lag | lag | seconds | • | • |
| mssql.ag_db_suspended | active, suspended | state | • | • |
| mssql.ag_db_failover_readiness | ready, not_ready | state | • | • |
| mssql.ag_db_joined_state | joined, not_joined | state | • | • |
These metrics refer to the Windows Server Failover Clustering quorum state.
This scope has no labels.
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.ag_cluster_quorum_state | normal, forced, unknown | state | • | • |
These metrics refer to individual WSFC cluster members.
Labels:
| Label | Description |
|---|---|
| cluster_member | Cluster member name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.ag_cluster_member_state | up, down | state | • | • |
| mssql.ag_cluster_member_quorum_votes | votes | votes | • | • |
These metrics refer to automatic page repair events per database in an Availability Group.
Labels:
| Label | Description |
|---|---|
| database | Database name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.ag_page_repair | successful, failed | repairs | • | • |
There are no alerts configured by default for this integration.
Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.
To troubleshoot issues with the mssql collector, run the go.d.plugin with the debug option enabled. The output
should give you clues as to why the collector isn’t working.
Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on
your system, open netdata.conf and look for the plugins setting under [directories].
cd /usr/libexec/netdata/plugins.d/
Switch to the netdata user.
sudo -u netdata -s
Run the go.d.plugin to debug the collector:
./go.d.plugin -d -m mssql
To debug a specific job:
./go.d.plugin -d -m mssql -j jobName
If you’re encountering problems with the mssql collector, follow these steps to retrieve logs and identify potential issues:
Use the following command to view logs generated since the last Netdata service restart:
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep mssql
Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for collector’s name:
grep mssql /var/log/netdata/collector.log
Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.
If your Netdata runs in a Docker container named “netdata” (replace if different), use this command:
docker logs netdata 2>&1 | grep mssql
Ensure SQL Server is running and accepting TCP connections on the configured port. Check that the SQL Server Browser service is running if using named instances.
Verify the username and password in the DSN are correct. Ensure SQL Server is configured for mixed mode authentication if using SQL logins.
The monitoring user needs VIEW SERVER STATE permission.
Grant it with: GRANT VIEW SERVER STATE TO netdata_user;
Want a personalised demo of Netdata for your use case?