
ClickHouse

Plugin: go.d.plugin Module: clickhouse

Overview

This collector retrieves performance data from ClickHouse for connections, queries, resources, replication, IO, and data operations (inserts, selects, merges) using HTTP requests and ClickHouse system tables. It monitors your ClickHouse server's health and activity.

It sends HTTP requests to the ClickHouse HTTP interface, executing SELECT queries to retrieve data from various system tables. Specifically, it collects metrics from the following tables:

  • system.metrics
  • system.asynchronous_metrics
  • system.events
  • system.disks
  • system.parts
  • system.processes
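
As a rough illustration of the request pattern described above, the sketch below builds a GET URL for the ClickHouse HTTP interface and parses a tab-separated response. The query text, the helper names (build_query_url, parse_metrics), and the sample response body are assumptions made for illustration; the exact SQL the collector sends is not shown on this page.

```python
from urllib.parse import quote, urlencode, urlsplit, parse_qs


def build_query_url(base_url, query):
    """Return a URL that passes the SQL text in the `query` URL parameter."""
    params = urlencode({"query": query}, quote_via=quote)
    return f"{base_url.rstrip('/')}/?{params}"


def parse_metrics(tsv_body):
    """Parse 'name<TAB>value' lines (TabSeparated output) into a dict."""
    metrics = {}
    for line in tsv_body.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, value = line.split("\t", 1)
        metrics[name] = float(value)
    return metrics


# Hypothetical query; the collector's actual SQL may differ.
QUERY = "SELECT metric, value FROM system.metrics FORMAT TabSeparated"
url = build_query_url("http://127.0.0.1:8123", QUERY)
print(url)

# Sample response body, made up for illustration.
sample = "TCPConnection\t3\nHTTPConnection\t1\n"
print(parse_metrics(sample))
```

Fetching the URL (for example with urllib.request.urlopen) is left out so that the sketch runs without a ClickHouse server.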

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

By default, it detects ClickHouse instances running on localhost that are listening on port 8123. On startup, it tries to collect metrics from:

  • http://127.0.0.1:8123

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per ClickHouse instance

These metrics refer to the entire monitored application.

This scope has no labels.

Metrics:

| Metric | Dimensions | Unit |
|--------|------------|------|
| clickhouse.connections | tcp, http, mysql, postgresql, interserver | connections |
| clickhouse.slow_reads | slow | reads/s |
| clickhouse.read_backoff | read_backoff | events/s |
| clickhouse.memory_usage | used | bytes |
| clickhouse.running_queries | running | queries |
| clickhouse.queries_preempted | preempted | queries |
| clickhouse.queries | successful, failed | queries/s |
| clickhouse.select_queries | successful, failed | selects/s |
| clickhouse.insert_queries | successful, failed | inserts/s |
| clickhouse.queries_memory_limit_exceeded | mem_limit_exceeded | queries/s |
| clickhouse.longest_running_query_time | longest_query_time | seconds |
| clickhouse.queries_latency | queries_time | microseconds |
| clickhouse.select_queries_latency | selects_time | microseconds |
| clickhouse.insert_queries_latency | inserts_time | microseconds |
| clickhouse.io | reads, writes | bytes/s |
| clickhouse.iops | reads, writes | ops/s |
| clickhouse.io_errors | read, write | errors/s |
| clickhouse.io_seeks | lseek | ops/s |
| clickhouse.io_file_opens | file_open | ops/s |
| clickhouse.replicated_parts_current_activity | fetch, send, check | parts |
| clickhouse.replicas_max_absolute_delay | replication_delay | seconds |
| clickhouse.replicated_readonly_tables | read_only | tables |
| clickhouse.replicated_data_loss | data_loss | events |
| clickhouse.replicated_part_fetches | successful, failed | fetches/s |
| clickhouse.inserted_rows | inserted | rows/s |
| clickhouse.inserted_bytes | inserted | bytes/s |
| clickhouse.rejected_inserts | rejected | inserts/s |
| clickhouse.delayed_inserts | delayed | inserts/s |
| clickhouse.delayed_inserts_throttle_time | delayed_inserts_throttle_time | milliseconds |
| clickhouse.selected_bytes | selected | bytes/s |
| clickhouse.selected_rows | selected | rows/s |
| clickhouse.selected_parts | selected | parts/s |
| clickhouse.selected_ranges | selected | ranges/s |
| clickhouse.selected_marks | selected | marks/s |
| clickhouse.merges | merge | ops/s |
| clickhouse.merges_latency | merges_time | milliseconds |
| clickhouse.merged_uncompressed_bytes | merged_uncompressed | bytes/s |
| clickhouse.merged_rows | merged | rows/s |
| clickhouse.merge_tree_data_writer_inserted_rows | inserted | rows/s |
| clickhouse.merge_tree_data_writer_uncompressed_bytes | inserted | bytes/s |
| clickhouse.merge_tree_data_writer_compressed_bytes | written | bytes/s |
| clickhouse.uncompressed_cache_requests | hits, misses | requests/s |
| clickhouse.mark_cache_requests | hits, misses | requests/s |
| clickhouse.max_part_count_for_partition | max_parts_partition | parts |
| clickhouse.parts_count | temporary, pre_active, active, deleting, delete_on_destroy, outdated, wide, compact | parts |
| clickhouse.distributed_connections | active | connections |
| clickhouse.distributed_connections_attempts | connection | attempts/s |
| clickhouse.distributed_connections_fail_retries | connection_retry | fails/s |
| clickhouse.distributed_connections_fail_exhausted_retries | connection_retry_exhausted | fails/s |
| clickhouse.distributed_files_to_insert | pending_insertions | files |
| clickhouse.distributed_rejected_inserts | rejected | inserts/s |
| clickhouse.distributed_delayed_inserts | delayed | inserts/s |
| clickhouse.distributed_delayed_inserts_latency | delayed_time | milliseconds |
| clickhouse.distributed_sync_insertion_timeout_exceeded | sync_insertion | timeouts/s |
| clickhouse.distributed_async_insertions_failures | async_insertions | failures/s |
| clickhouse.uptime | uptime | seconds |

Per disk

These metrics refer to the Disk.

Labels:

| Label | Description |
|-------|-------------|
| disk_name | Name of the disk as defined in the server configuration. |

Metrics:

| Metric | Dimensions | Unit |
|--------|------------|------|
| clickhouse.disk_space_usage | free, used | bytes |

Per table

These metrics refer to the Database Table.

Labels:

| Label | Description |
|-------|-------------|
| database | Name of the database. |
| table | Name of the table. |

Metrics:

| Metric | Dimensions | Unit |
|--------|------------|------|
| clickhouse.database_table_size | size | bytes |
| clickhouse.database_table_parts | parts | parts |
| clickhouse.database_table_rows | rows | rows |
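
To make the per-table scope more concrete, here is a minimal sketch of how per-part rows could be rolled up into one size/parts/rows entry per (database, table) label pair. It assumes the values come from system.parts and that size maps to the bytes_on_disk column; the function name aggregate_parts and the sample rows are hypothetical, and the collector's actual aggregation (for example, whether it counts only active parts) may differ.

```python
from collections import defaultdict


def aggregate_parts(rows):
    """Sum per-part values into one entry per (database, table) pair."""
    totals = defaultdict(lambda: {"size": 0, "parts": 0, "rows": 0})
    for row in rows:
        key = (row["database"], row["table"])
        totals[key]["size"] += row["bytes_on_disk"]  # assumed source of "size"
        totals[key]["parts"] += 1                    # one input row per data part
        totals[key]["rows"] += row["rows"]
    return dict(totals)


# Made-up sample rows, shaped like a subset of system.parts columns.
sample_rows = [
    {"database": "default", "table": "events", "bytes_on_disk": 1024, "rows": 100},
    {"database": "default", "table": "events", "bytes_on_disk": 2048, "rows": 250},
    {"database": "default", "table": "users", "bytes_on_disk": 512, "rows": 10},
]

for (database, table), values in aggregate_parts(sample_rows).items():
    print(database, table, values)
```

Each key in the result corresponds to one instance of this scope, identified by its database and table labels.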

Alerts

The following alerts are available:

| Alert name | On metric | Description |
|------------|-----------|-------------|
| clickhouse_restarted | clickhouse.uptime | ClickHouse has recently been restarted |
| clickhouse_queries_preempted | clickhouse.queries_preempted | ClickHouse has queries that are stopped and waiting due to priority setting |
| clickhouse_long_running_query | clickhouse.longest_running_query_time | ClickHouse has a long-running query exceeding the threshold |
| clickhouse_rejected_inserts | clickhouse.rejected_inserts | ClickHouse has INSERT queries that are rejected due to a high number of active data parts for a partition in a MergeTree |
| clickhouse_delayed_inserts | clickhouse.delayed_inserts | ClickHouse has INSERT queries that are throttled due to a high number of active data parts for a partition in a MergeTree |
| clickhouse_replication_lag | clickhouse.replicas_max_absolute_delay | ClickHouse is experiencing replication lag greater than 5 minutes |
| clickhouse_replicated_readonly_tables | clickhouse.replicated_readonly_tables | ClickHouse has replicated tables in readonly state due to ZooKeeper session loss or startup without ZooKeeper configured |
| clickhouse_max_part_count_for_partition | clickhouse.max_part_count_for_partition | ClickHouse has a high number of parts per partition |
| clickhouse_distributed_connections_failures | clickhouse.distributed_connections_fail_exhausted_retries | ClickHouse has failed distributed connections after exhausting all retry attempts |
| clickhouse_distributed_files_to_insert | clickhouse.distributed_files_to_insert | ClickHouse has a high number of pending files to process for asynchronous insertion into Distributed tables |

Setup

Prerequisites

No action required.

Configuration

File

The configuration file name for this integration is go.d/clickhouse.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/clickhouse.conf

Options

The following options can be defined globally: update_every, autodetection_retry.

Config options
| Name | Description | Default | Required |
|------|-------------|---------|----------|
| update_every | Data collection frequency. | 1 | no |
| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
| url | Server URL. | http://127.0.0.1:8123 | yes |
| timeout | HTTP request timeout. | 1 | no |
| username | Username for basic HTTP authentication. | | no |
| password | Password for basic HTTP authentication. | | no |
| proxy_url | Proxy URL. | | no |
| proxy_username | Username for proxy basic HTTP authentication. | | no |
| proxy_password | Password for proxy basic HTTP authentication. | | no |
| method | HTTP request method. | GET | no |
| body | HTTP request body. | | no |
| headers | HTTP request headers. | | no |
| not_follow_redirects | Redirect handling policy. Controls whether the client follows redirects. | no | no |
| tls_skip_verify | Server certificate chain and hostname validation policy. Controls whether the client performs this check. | no | no |
| tls_ca | Certification authority that the client uses when verifying the server's certificates. | | no |
| tls_cert | Client TLS certificate. | | no |
| tls_key | Client TLS key. | | no |
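
One way to read the Default column is as a base layer that each job's explicit settings override. The sketch below models that with a plain dictionary merge. The DEFAULTS values are copied from the table, with "no" represented as False; the function name effective_job is hypothetical, and this is not a description of how the plugin itself resolves its configuration.

```python
# Defaults copied from the options table; "no" is represented as False.
DEFAULTS = {
    "update_every": 1,
    "autodetection_retry": 0,
    "url": "http://127.0.0.1:8123",
    "timeout": 1,
    "method": "GET",
    "not_follow_redirects": False,
    "tls_skip_verify": False,
}


def effective_job(job):
    """Overlay a job's explicit settings on top of the documented defaults."""
    return {**DEFAULTS, **job}


local = effective_job({"name": "local"})
remote = effective_job({"name": "remote", "url": "http://192.0.2.1:8123", "timeout": 5})
print(local["url"], local["timeout"])
print(remote["url"], remote["timeout"])
```

Options without a documented default (such as username or tls_ca) are simply absent from the result unless the job sets them.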

Examples

Basic

A basic example configuration.

jobs:
  - name: local
    url: http://127.0.0.1:8123

HTTP authentication

Basic HTTP authentication.

jobs:
  - name: local
    url: http://127.0.0.1:8123
    username: username
    password: password
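
For readers unfamiliar with basic HTTP authentication, the sketch below shows how a username and password pair is turned into an Authorization request header (the scheme defined in RFC 7617). The helper name basic_auth_header is hypothetical; this illustrates the protocol, not the collector's own code.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header for basic HTTP authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials matching the example configuration above.
header = basic_auth_header("username", "password")
print(header)
```

Because the encoding is reversible, basic authentication protects the credentials only when the connection itself is encrypted.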

HTTPS with self-signed certificate

ClickHouse with enabled HTTPS and self-signed certificate.

jobs:
  - name: local
    url: https://127.0.0.1:8123
    tls_skip_verify: yes
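
To show what skipping verification means in practice, here is a small sketch using Python's standard ssl module: with verification skipped, both the certificate chain check and the hostname check are turned off. The function name make_tls_context is hypothetical, and the collector uses its own TLS implementation, so treat this only as an analogy for the tls_skip_verify option.

```python
import ssl


def make_tls_context(skip_verify=False):
    """Return a client TLS context, optionally with verification disabled."""
    ctx = ssl.create_default_context()
    if skip_verify:
        # Order matters: hostname checking must be disabled before
        # certificate verification can be set to CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


strict = make_tls_context()
relaxed = make_tls_context(skip_verify=True)
print(strict.verify_mode, strict.check_hostname)
print(relaxed.verify_mode, relaxed.check_hostname)
```

Skipping verification accepts any certificate, so it is best limited to test setups; for a self-signed certificate you control, pointing tls_ca at that certificate keeps verification on.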

Multi-instance

Note: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

jobs:
  - name: local
    url: http://127.0.0.1:8123

  - name: remote
    url: http://192.0.2.1:8123
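
Since job names must be unique, a quick pre-flight check over the parsed jobs list can catch duplicates before the file is deployed. The sketch below assumes the configuration has already been loaded into a list of dictionaries; the function name duplicate_job_names is hypothetical and is not part of Netdata.

```python
from collections import Counter


def duplicate_job_names(jobs):
    """Return the sorted list of job names that appear more than once."""
    counts = Counter(job["name"] for job in jobs)
    return sorted(name for name, count in counts.items() if count > 1)


# Same jobs as the multi-instance example above.
jobs = [
    {"name": "local", "url": "http://127.0.0.1:8123"},
    {"name": "remote", "url": "http://192.0.2.1:8123"},
]
print(duplicate_job_names(jobs))

# A second job reusing the name "local" would be reported.
print(duplicate_job_names(jobs + [{"name": "local", "url": "http://192.0.2.2:8123"}]))
```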

Troubleshooting

Debug Mode

To troubleshoot issues with the clickhouse collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/

  • Switch to the netdata user.

    sudo -u netdata -s

  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m clickhouse

Getting Logs

If you're encountering problems with the clickhouse collector, follow these steps to retrieve logs and identify potential issues:

  • Run the command specific to your system (systemd, non-systemd, or Docker container).
  • Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep clickhouse

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector's name:

grep clickhouse /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:

docker logs netdata 2>&1 | grep clickhouse
