Skip to main content

Disk Requirements & Retention

Database Modes and Tiers

Netdata comes with 3 database modes:

  1. dbengine: the default high-performance multi-tier database of Netdata. Metric samples are cached in memory and are saved to disk in multiple tiers, with compression.
  2. ram: metric samples are stored in ring buffers in memory, with increments of 1024 samples. Metric samples are not committed to disk. Kernel-Same-Page (KSM) can be used to deduplicate Netdata's memory.
  3. alloc: metric samples are stored in ring buffers in memory, with flexible increments. Metric samples are not committed to disk.

ram and alloc

Modes ram and alloc can help when Netdata should not introduce any disk I/O at all. In both of these modes, metric samples exist only in memory, and only while they are collected.

When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers available in these modes is used to cache the collected samples for some time, in case there are network issues, or the Netdata Parent is restarted for maintenance.

The memory required per sample in these modes, is 4 bytes:

  • ram mode uses mmap() behind the scene, and can be incremented in steps of 1024 samples (4KiB). Mode ram allows the use of the Linux kernel memory dedupper (Kernel-Same-Page or KSM) to deduplicate Netdata ring buffers and save memory.
  • alloc mode can be sized for any number of samples per metric. KSM cannot be used in this mode.

To configure database mode ram or alloc, in netdata.conf, set the following:

  • [db].mode to either ram or alloc.
  • [db].retention to the number of samples the ring buffers should maintain. For ram if the value set is not a multiple of 1024, the next multiple of 1024 will be used.

dbengine

dbengine supports up to 5 tiers. By default, 3 tiers are used, like this:

TierResolutionUncompressed Sample SizeUsually On Disk
tier0native resolution (metrics collected per-second as stored per-second)4 bytes0.6 bytes
tier160 iterations of tier0, so when metrics are collected per-second, this tier is per-minute.16 bytes6 bytes
tier260 iterations of tier1, so when metrics are collected per second, this tier is per-hour.16 bytes18 bytes

Data are saved to disk compressed, so the actual size on disk varies depending on compression efficiency.

dbegnine tiers are overlapping, so higher tiers include a down-sampled version of the samples in lower tiers:

Disk Space and Metrics Retention

You can find information about the current disk utilization of a Netdata Parent, at http://agent-ip:19999/api/v2/info. The output of this endpoint is like this:

{
// more information about the agent
// then, near the end:
"db_size": [
{
"tier": 0,
"metrics": 43070,
"samples": 88078162001,
"disk_used": 41156409552,
"disk_max": 41943040000,
"disk_percent": 98.1245269,
"from": 1705033983,
"to": 1708856640,
"retention": 3822657,
"expected_retention": 3895720,
"currently_collected_metrics": 27424
},
{
"tier": 1,
"metrics": 72987,
"samples": 5155155269,
"disk_used": 20585157180,
"disk_max": 20971520000,
"disk_percent": 98.1576785,
"from": 1698287340,
"to": 1708856640,
"retention": 10569300,
"expected_retention": 10767675,
"currently_collected_metrics": 27424
},
{
"tier": 2,
"metrics": 148234,
"samples": 314919121,
"disk_used": 5957346684,
"disk_max": 10485760000,
"disk_percent": 56.8136853,
"from": 1667808000,
"to": 1708856640,
"retention": 41048640,
"expected_retention": 72251324,
"currently_collected_metrics": 27424
}
]
}

In this example:

  • tier is the database tier.
  • metrics is the number of unique time-series in the database.
  • samples is the number of samples in the database.
  • disk_used is the currently used disk space in bytes.
  • disk_max is the configured max disk space in bytes.
  • disk_percent is the current disk space utilization for this tier.
  • from is the first (oldest) timestamp in the database for this tier.
  • to is the latest (newest) timestamp in the database for this tier.
  • retention is the current retention of the database for this tier, in seconds (divide by 3600 for hours, divide by 86400 for days).
  • expected_retention is the expected retention in seconds when disk_percent will be 100 (divide by 3600 for hours, divide by 86400 for days).
  • currently_collected_metrics is the number of unique time-series currently being collected for this tier.

So, for our example above:

Tier# Of Metrics# Of SamplesDisk UsedDisk FreeCurrent RetentionExpected RetentionSample Size
043.1K88.1 billion38.4Gi1.88%44.2 days45.0 days0.46 B
173.0K5.2 billion19.2Gi1.84%122.3 days124.6 days3.99 B
2148.3K315.0 million5.6Gi43.19%475.1 days836.2 days18.91 B

To configure retention, in netdata.conf, set the following:

  • [db].mode to dbengine.
  • [db].dbengine multihost disk space MB, this is the max disk size for tier0. The default is 256MiB.
  • [db].dbengine tier 1 multihost disk space MB, this is the max disk space for tier1. The default is 50% of tier0.
  • [db].dbengine tier 2 multihost disk space MB, this is the max disk space for tier2. The default is 50% of tier1.

Do you have any feedback for this page? If so, you can open a new issue on our netdata/learn repository.