CockroachDB
Plugin: go.d.plugin Module: cockroachdb
Overview
This collector monitors CockroachDB servers.
It scrapes Prometheus metrics from the CockroachDB /_status/vars endpoint.
It also provides top-queries and running-queries functions using SQL statement statistics (crdb_internal.cluster_statement_statistics) and the SHOW CLUSTER STATEMENTS command.
This collector is supported on all platforms.
This collector supports collecting metrics from multiple instances of this integration, including remote instances.
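A quick way to verify that the metrics endpoint is reachable before adding a job (a minimal check, assuming the default admin HTTP port 8080 on localhost; adjust host, port, and scheme to match your deployment):
curl -s http://127.0.0.1:8080/_status/vars | head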
The top-queries and running-queries functions require:
- A SQL user with VIEWACTIVITY or VIEWACTIVITYREDACTED privileges.
- Access to crdb_internal.cluster_statement_statistics (may require SET allow_unsafe_internals = on on newer versions).
Default Behavior
Auto-Detection
This integration doesn't support auto-detection.
Limits
The default configuration for this integration does not impose any limits on data collection.
Performance Impact
The default configuration for this integration is not expected to impose a significant performance impact on the system.
Metrics
Metrics grouped by scope.
The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.
Per CockroachDB instance
These metrics refer to the entire monitored application.
This scope has no labels.
Metrics:
| Metric | Dimensions | Unit |
|---|---|---|
| cockroachdb.process_cpu_time_combined_percentage | used | percentage |
| cockroachdb.process_cpu_time_percentage | user, sys | percentage |
| cockroachdb.process_cpu_time | user, sys | ms |
| cockroachdb.process_memory | rss | KiB |
| cockroachdb.process_file_descriptors | open | fd |
| cockroachdb.process_uptime | uptime | seconds |
| cockroachdb.host_disk_bandwidth | read, write | KiB |
| cockroachdb.host_disk_operations | reads, writes | operations |
| cockroachdb.host_disk_iops_in_progress | in_progress | iops |
| cockroachdb.host_network_bandwidth | received, sent | kilobits |
| cockroachdb.host_network_packets | received, sent | packets |
| cockroachdb.live_nodes | live_nodes | nodes |
| cockroachdb.node_liveness_heartbeats | successful, failed | heartbeats |
| cockroachdb.total_storage_capacity | total | KiB |
| cockroachdb.storage_capacity_usability | usable, unusable | KiB |
| cockroachdb.storage_usable_capacity | available, used | KiB |
| cockroachdb.storage_used_capacity_percentage | total, usable | percentage |
| cockroachdb.sql_connections | active | connections |
| cockroachdb.sql_bandwidth | received, sent | KiB |
| cockroachdb.sql_statements_total | started, executed | statements |
| cockroachdb.sql_errors | statement, transaction | errors |
| cockroachdb.sql_started_ddl_statements | ddl | statements |
| cockroachdb.sql_executed_ddl_statements | ddl | statements |
| cockroachdb.sql_started_dml_statements | select, update, delete, insert | statements |
| cockroachdb.sql_executed_dml_statements | select, update, delete, insert | statements |
| cockroachdb.sql_started_tcl_statements | begin, commit, rollback, savepoint, savepoint_cockroach_restart, release_savepoint_cockroach_restart, rollback_to_savepoint_cockroach_restart | statements |
| cockroachdb.sql_executed_tcl_statements | begin, commit, rollback, savepoint, savepoint_cockroach_restart, release_savepoint_cockroach_restart, rollback_to_savepoint_cockroach_restart | statements |
| cockroachdb.sql_active_distributed_queries | active | queries |
| cockroachdb.sql_distributed_flows | active, queued | flows |
| cockroachdb.live_bytes | applications, system | KiB |
| cockroachdb.logical_data | keys, values | KiB |
| cockroachdb.logical_data_count | keys, values | num |
| cockroachdb.kv_transactions | committed, fast-path_committed, aborted | transactions |
| cockroachdb.kv_transaction_restarts | write_too_old, write_too_old_multiple, forwarded_timestamp, possible_reply, async_consensus_failure, read_within_uncertainty_interval, aborted, push_failure, unknown | restarts |
| cockroachdb.ranges | ranges | ranges |
| cockroachdb.ranges_replication_problem | unavailable, under_replicated, over_replicated | ranges |
| cockroachdb.range_events | split, add, remove, merge | events |
| cockroachdb.range_snapshot_events | generated, applied_raft_initiated, applied_learner, applied_preemptive | events |
| cockroachdb.rocksdb_read_amplification | reads | reads/query |
| cockroachdb.rocksdb_table_operations | compactions, flushes | operations |
| cockroachdb.rocksdb_cache_usage | used | KiB |
| cockroachdb.rocksdb_cache_operations | hits, misses | operations |
| cockroachdb.rocksdb_cache_hit_rate | hit_rate | percentage |
| cockroachdb.rocksdb_sstables | sstables | sstables |
| cockroachdb.replicas | replicas | replicas |
| cockroachdb.replicas_quiescence | quiescent, active | replicas |
| cockroachdb.replicas_leaders | leaders, not_leaseholders | replicas |
| cockroachdb.replicas_leaseholders | leaseholders | leaseholders |
| cockroachdb.queue_processing_failures | gc, replica_gc, replication, split, consistency, raft_log, raft_snapshot, time_series_maintenance | failures |
| cockroachdb.rebalancing_queries | avg | queries/s |
| cockroachdb.rebalancing_writes | avg | writes/s |
| cockroachdb.timeseries_samples | written | samples |
| cockroachdb.timeseries_write_errors | write | errors |
| cockroachdb.timeseries_write_bytes | written | KiB |
| cockroachdb.slow_requests | acquiring_latches, acquiring_lease, in_raft | requests |
| cockroachdb.code_heap_memory_usage | go, cgo | KiB |
| cockroachdb.goroutines | goroutines | goroutines |
| cockroachdb.gc_count | gc | invokes |
| cockroachdb.gc_pause | pause | us |
| cockroachdb.cgo_calls | cgo | calls |
Functions
This collector exposes real-time functions for interactive troubleshooting in the Top tab.
Top Queries
Retrieves and aggregates SQL statement performance metrics from the CockroachDB crdb_internal.cluster_statement_statistics table.
This function queries cluster-wide statement statistics grouped by fingerprint (normalized query pattern). It provides aggregated metrics including execution counts, timing breakdowns, and row operation statistics.
Use cases:
- Identify slow queries consuming the most total execution time
- Find frequently executed queries that may benefit from optimization
- Analyze row read/write patterns to detect inefficient queries
Query text is truncated at 4096 characters for display purposes.
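For reference, the data this function aggregates can also be inspected manually with a query along these lines (a simplified sketch; the JSON field names reflect the crdb_internal schema at the time of writing and may vary between CockroachDB versions):
SELECT
  encode(fingerprint_id, 'hex') AS fingerprint_hex, -- normalized query pattern identifier
  app_name,                                         -- application that executed the statement
  metadata->>'db'    AS database_name,              -- database context, if any
  metadata->>'query' AS query_text                  -- normalized statement text
FROM crdb_internal.cluster_statement_statistics
LIMIT 10;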
| Aspect | Description |
|---|---|
| Name | Cockroachdb:top-queries |
| Require Cloud | yes |
| Performance | Queries the crdb_internal.cluster_statement_statistics table, which aggregates data across the cluster: • On busy clusters with high query throughput, this query may take longer • Default limit of 500 rows balances usefulness with performance |
| Security | Query text may contain unmasked literal values including potentially sensitive data: • Personal information in WHERE clauses or INSERT values • Business data and internal identifiers • Access should be restricted to authorized personnel only |
| Availability | Available when: • The collector has successfully connected to CockroachDB • The SQL user has VIEWACTIVITY or VIEWACTIVITYREDACTED privileges. Error responses: • HTTP 503 if the SQL connection cannot be established • HTTP 500 if the query fails • HTTP 504 if the query times out |
Prerequisites
Grant VIEWACTIVITY access to cluster statement stats
The SQL user must have appropriate privileges to access statement statistics.
- Grant VIEWACTIVITY (shows full query text) or VIEWACTIVITYREDACTED (masks literals):

  GRANT SYSTEM VIEWACTIVITY TO netdata_user;
  -- OR for privacy:
  GRANT SYSTEM VIEWACTIVITYREDACTED TO netdata_user;

- On newer CockroachDB versions, access to crdb_internal may require:

  SET allow_unsafe_internals = on;

Notes:
- The collector automatically sets allow_unsafe_internals = on for the session when querying crdb_internal tables (required on newer versions).
- VIEWACTIVITYREDACTED replaces literal values with underscores for privacy.
- Statement statistics are collected by default but can be disabled via cluster settings.
Parameters
| Parameter | Type | Description | Required | Default | Options |
|---|---|---|---|---|---|
| Filter By | select | Select the primary sort column. Options include total time, executions, rows read, rows written, and more. Defaults to total time to focus on the most resource-intensive queries. | yes | totalTime | |
Returns
Aggregated SQL statement statistics grouped by fingerprint. Each row represents a unique query pattern with cumulative metrics across all executions.
| Column | Type | Unit | Visibility | Description |
|---|---|---|---|---|
| Fingerprint ID | string | | hidden | Unique hash identifier for the normalized query pattern. Queries with identical structure but different literal values share the same fingerprint. |
| Query | string | | | Normalized SQL statement text with literals replaced. Truncated to 4096 characters. |
| Database | string | | | Database name where the query was executed. Empty for queries without database context. |
| Application | string | | | Application name that executed the query. Useful for identifying query sources across services. |
| Statement Type | string | | hidden | Type of SQL statement (SELECT, INSERT, UPDATE, DELETE, etc.). |
| Distributed | string | | hidden | Whether the query used DistSQL execution (true/false). Distributed queries span multiple nodes. |
| Full Scan | string | | hidden | Whether the query performed a full table scan (true/false). Full scans may indicate missing indexes. |
| Implicit Txn | string | | hidden | Whether the statement ran in an implicit transaction (true/false). |
| Vectorized | string | | hidden | Whether the query used vectorized execution (true/false). Vectorized execution improves performance for analytical queries. |
| Executions | integer | | | Total number of times this query pattern has been executed. High values indicate frequently run queries. |
| Total Time | duration | milliseconds | | Cumulative service latency across all executions (mean time × executions). High values indicate queries consuming significant cluster resources. |
| Mean Time | duration | milliseconds | | Average service latency per execution. Use this to compare typical performance across query patterns. |
| Run Time | duration | milliseconds | hidden | Average time spent executing the query after planning. Excludes parse and plan time. |
| Plan Time | duration | milliseconds | hidden | Average time spent generating the query execution plan. High values may indicate complex queries or stale statistics. |
| Parse Time | duration | milliseconds | hidden | Average time spent parsing the SQL statement. |
| Rows Read | integer | | | Total rows read across all executions. High values relative to rows returned suggest missing indexes or inefficient scans. |
| Rows Written | integer | | | Total rows written across all executions. Indicates write workload for INSERT, UPDATE, DELETE statements. |
| Rows Returned | integer | | | Total rows returned to clients across all executions. Compare with rows read to assess query efficiency. |
| Bytes Read | integer | | hidden | Total bytes read from storage across all executions. Indicates I/O load for the query pattern. |
| Max Retries | integer | | hidden | Maximum number of automatic retries observed for this query pattern. High values indicate transaction contention. |
Running Queries
Retrieves currently executing SQL statements across the CockroachDB cluster using SHOW CLUSTER STATEMENTS.
This function provides a real-time snapshot of all active queries across all nodes in the cluster, including their execution phase, duration, and associated metadata.
Use cases:
- Identify long-running queries that may be blocking other operations
- Monitor active workload distribution across the cluster
- Debug stuck or slow queries in real-time
Query text is truncated at 4096 characters for display purposes.
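To take the same snapshot by hand, or to act on a problematic query, you can run the underlying statement yourself (the query ID below is a placeholder; copy a real ID from the function output or from SHOW CLUSTER STATEMENTS):
SHOW CLUSTER STATEMENTS;
-- Cancel a specific long-running query by its ID (placeholder value):
CANCEL QUERY '<query_id>';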
| Aspect | Description |
|---|---|
| Name | Cockroachdb:running-queries |
| Require Cloud | yes |
| Performance | Executes the SHOW CLUSTER STATEMENTS command, which queries all nodes in the cluster: • Lightweight operation with minimal overhead • Returns only currently active queries, typically a small result set |
| Security | Query text may contain unmasked literal values including potentially sensitive data: • Personal information in WHERE clauses or VALUES • Session tokens or credentials • Access should be restricted to authorized personnel only |
| Availability | Available when: • The collector has successfully connected to CockroachDB • The SQL user has VIEWACTIVITY or VIEWACTIVITYREDACTED privileges. Error responses: • HTTP 503 if the SQL connection cannot be established • HTTP 500 if the query fails • HTTP 504 if the query times out |
Prerequisites
Grant VIEWACTIVITY access to system tables
The SQL user must have appropriate privileges to view running statements.
- Grant VIEWACTIVITY (shows full query text) or VIEWACTIVITYREDACTED (masks literals):

  GRANT SYSTEM VIEWACTIVITY TO netdata_user;
  -- OR for privacy:
  GRANT SYSTEM VIEWACTIVITYREDACTED TO netdata_user;

Notes:
- SHOW CLUSTER STATEMENTS shows queries across all nodes, not just the connected node.
- VIEWACTIVITYREDACTED replaces literal values with underscores for privacy.
- Queries shown are point-in-time snapshots and may complete between retrieval and display.
Parameters
| Parameter | Type | Description | Required | Default | Options |
|---|---|---|---|---|---|
| Filter By | select | Select the primary sort column. Defaults to elapsed time to show longest-running queries first. | yes | elapsedMs | |
Returns
Real-time snapshot of currently executing SQL statements across all cluster nodes. Each row represents a single active query.
| Column | Type | Unit | Visibility | Description |
|---|---|---|---|---|
| Query ID | string | | hidden | Unique identifier for this specific query execution. Can be used with CANCEL QUERY if needed. |
| Query | string | | | The SQL statement currently being executed. Truncated to 4096 characters. |
| User | string | | | Database user executing the query. Useful for identifying workload by user. |
| Application | string | | | Application name from the client connection. Helps identify which service is running the query. |
| Client Address | string | | hidden | IP address of the client connection. Useful for identifying query sources. |
| Node ID | string | | hidden | CockroachDB node currently executing the query. Helps identify workload distribution. |
| Session ID | string | | hidden | Session identifier for the connection. Multiple queries may share a session. |
| Phase | string | | | Current execution phase (executing, preparing, etc.). Indicates query progress. |
| Distributed | string | | hidden | Whether the query is using distributed execution across multiple nodes. |
| Start Time | string | | hidden | Timestamp when the query started executing. |
| Elapsed | duration | milliseconds | | Time elapsed since query started. High values indicate long-running queries that may need investigation. |
Alerts
The following alerts are available:
| Alert name | On metric | Description |
|---|---|---|
| cockroachdb_used_storage_capacity | cockroachdb.storage_used_capacity_percentage | storage capacity utilization |
| cockroachdb_used_usable_storage_capacity | cockroachdb.storage_used_capacity_percentage | storage usable space utilization |
| cockroachdb_unavailable_ranges | cockroachdb.ranges_replication_problem | number of ranges with fewer live replicas than needed for quorum |
| cockroachdb_underreplicated_ranges | cockroachdb.ranges_replication_problem | number of ranges with fewer live replicas than the replication target |
| cockroachdb_open_file_descriptors_limit | cockroachdb.process_file_descriptors | open file descriptors utilization (against soft limit) |
Setup
You can configure the cockroachdb collector in two ways:
| Method | Best for | How to |
|---|---|---|
| UI | Fast setup without editing files | Go to Nodes → Configure this node → Collectors → Jobs, search for cockroachdb, then click + to add a job. |
| File | If you prefer configuring via file, or need to automate deployments (e.g., with Ansible) | Edit go.d/cockroachdb.conf and add a job. |
UI configuration requires a paid Netdata Cloud plan.
Prerequisites
No action required.
Configuration
Options
The following options can be defined globally: update_every, autodetection_retry.
Config options
| Group | Option | Description | Default | Required |
|---|---|---|---|---|
| Collection | update_every | Data collection interval (seconds). | 10 | no |
| | autodetection_retry | Autodetection retry interval (seconds). Set 0 to disable. | 0 | no |
| Target | url | Target endpoint URL. | http://127.0.0.1:8080/_status/vars | yes |
| | timeout | HTTP request timeout (seconds). | 1 | no |
| Query Functions | dsn | SQL DSN used by top-queries and running-queries functions. | | no |
| | sql_timeout | SQL query timeout (seconds) for query functions. | 1 | no |
| Limits | top_queries_limit | Maximum number of rows returned by the top-queries and running-queries functions. | 500 | no |
| HTTP Auth | username | Username for Basic HTTP authentication. | | no |
| | password | Password for Basic HTTP authentication. | | no |
| | bearer_token_file | Path to a file containing a bearer token (used for Authorization: Bearer). | | no |
| TLS | tls_skip_verify | Skip TLS certificate and hostname verification (insecure). | no | no |
| | tls_ca | Path to CA bundle used to validate the server certificate. | | no |
| | tls_cert | Path to client TLS certificate (for mTLS). | | no |
| | tls_key | Path to client TLS private key (for mTLS). | | no |
| Proxy | proxy_url | HTTP proxy URL. | | no |
| | proxy_username | Username for proxy Basic HTTP authentication. | | no |
| | proxy_password | Password for proxy Basic HTTP authentication. | | no |
| Request | method | HTTP method to use. | GET | no |
| | body | Request body (e.g., for POST/PUT). | | no |
| | headers | Additional HTTP headers (one per line as key: value). | | no |
| | not_follow_redirects | Do not follow HTTP redirects. | no | no |
| | force_http2 | Force HTTP/2 (including h2c over TCP). | no | no |
| Virtual Node | vnode | Associates this data collection job with a Virtual Node. | | no |
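For example, a single job that scrapes metrics and also enables the query functions with a tighter row limit might look like this (values are illustrative and the SQL user name is an assumption; option names come from the table above):
jobs:
  - name: local
    url: http://127.0.0.1:8080/_status/vars
    dsn: postgres://netdata_user@127.0.0.1:26257/defaultdb?sslmode=disable
    sql_timeout: 2
    top_queries_limit: 200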
via UI
Configure the cockroachdb collector from the Netdata web interface:
- Go to Nodes.
- Select the node where you want the cockroachdb data-collection job to run and click the ⚙ (Configure this node). That node will run the data collection.
- The Collectors → Jobs view opens by default.
- In the Search box, type cockroachdb (or scroll the list) to locate the cockroachdb collector.
- Click the + next to the cockroachdb collector to add a new job.
- Fill in the job fields, then click Test to verify the configuration and Submit to save.
- Test runs the job with the provided settings and shows whether data can be collected.
- If it fails, an error message appears with details (for example, connection refused, timeout, or command execution errors), so you can adjust and retest.
via File
The configuration file name for this integration is go.d/cockroachdb.conf.
The file format is YAML. Generally, the structure is:
update_every: 1
autodetection_retry: 0
jobs:
- name: some_name1
- name: some_name2
You can edit the configuration file using the edit-config script from the
Netdata config directory.
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/cockroachdb.conf
Examples
Basic
An example configuration.
Config
jobs:
- name: local
url: http://127.0.0.1:8080/_status/vars
Top queries
Enable SQL query functions.
Config
jobs:
- name: local
url: http://127.0.0.1:8080/_status/vars
    dsn: postgres://netdata_user@127.0.0.1:26257/defaultdb?sslmode=disable
HTTP authentication
Local server with basic HTTP authentication.
Config
jobs:
- name: local
url: http://127.0.0.1:8080/_status/vars
username: username
password: password
HTTPS with self-signed certificate
CockroachDB with enabled HTTPS and self-signed certificate.
Config
jobs:
- name: local
url: https://127.0.0.1:8080/_status/vars
tls_skip_verify: yes
Multi-instance
Note: When you define multiple jobs, their names must be unique.
Collecting metrics from local and remote instances.
Config
jobs:
- name: local
url: http://127.0.0.1:8080/_status/vars
- name: remote
url: http://203.0.113.10:8080/_status/vars
Troubleshooting
Debug Mode
Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.
To troubleshoot issues with the cockroachdb collector, run the go.d.plugin with the debug option enabled. The output
should give you clues as to why the collector isn't working.
- Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].

  cd /usr/libexec/netdata/plugins.d/

- Switch to the netdata user.

  sudo -u netdata -s

- Run the go.d.plugin to debug the collector:

  ./go.d.plugin -d -m cockroachdb

  To debug a specific job:

  ./go.d.plugin -d -m cockroachdb -j jobName
Getting Logs
If you're encountering problems with the cockroachdb collector, follow these steps to retrieve logs and identify potential issues:
- Run the command specific to your system (systemd, non-systemd, or Docker container).
- Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.
System with systemd
Use the following command to view logs generated since the last Netdata service restart:
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep cockroachdb
System without systemd
Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector's name:
grep cockroachdb /var/log/netdata/collector.log
Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.
Docker Container
If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:
docker logs netdata 2>&1 | grep cockroachdb