Web server log files

Plugin: go.d.plugin Module: web_log

Overview

This collector monitors web servers by parsing their log files.

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

It automatically detects log files of web servers running on localhost.

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per Web server log files instance

These metrics refer to the entire monitored application.

This scope has no labels.

Metrics:

Metric	Dimensions	Unit
web_log.requests	requests	requests/s
web_log.excluded_requests	unmatched	requests/s
web_log.type_requests	success, bad, redirect, error	requests/s
web_log.status_code_class_responses	1xx, 2xx, 3xx, 4xx, 5xx	responses/s
web_log.status_code_class_1xx_responses	a dimension per 1xx code	responses/s
web_log.status_code_class_2xx_responses	a dimension per 2xx code	responses/s
web_log.status_code_class_3xx_responses	a dimension per 3xx code	responses/s
web_log.status_code_class_4xx_responses	a dimension per 4xx code	responses/s
web_log.status_code_class_5xx_responses	a dimension per 5xx code	responses/s
web_log.bandwidth	received, sent	kilobits/s
web_log.request_processing_time	min, max, avg	milliseconds
web_log.requests_processing_time_histogram	a dimension per bucket	requests/s
web_log.upstream_response_time	min, max, avg	milliseconds
web_log.upstream_responses_time_histogram	a dimension per bucket	requests/s
web_log.current_poll_uniq_clients	ipv4, ipv6	clients
web_log.vhost_requests	a dimension per vhost	requests/s
web_log.port_requests	a dimension per port	requests/s
web_log.scheme_requests	http, https	requests/s
web_log.http_method_requests	a dimension per HTTP method	requests/s
web_log.http_version_requests	a dimension per HTTP version	requests/s
web_log.ip_proto_requests	ipv4, ipv6	requests/s
web_log.ssl_proto_requests	a dimension per SSL protocol	requests/s
web_log.ssl_cipher_suite_requests	a dimension per SSL cipher suite	requests/s
web_log.url_pattern_requests	a dimension per URL pattern	requests/s
web_log.custom_field_pattern_requests	a dimension per custom field pattern	requests/s

Per custom time field

TBD

This scope has no labels.

Metrics:

Metric	Dimensions	Unit
web_log.custom_time_field_summary	min, max, avg	milliseconds
web_log.custom_time_field_histogram	a dimension per bucket	observations

Per custom numeric field

TBD

This scope has no labels.

Metrics:

Metric	Dimensions	Unit
web_log.custom_numeric_field_{{field_name}}_summary	min, max, avg	{{units}}

Per URL pattern

TBD

This scope has no labels.

Metrics:

Metric	Dimensions	Unit
web_log.url_pattern_status_code_responses	a dimension per pattern	responses/s
web_log.url_pattern_http_method_requests	a dimension per HTTP method	requests/s
web_log.url_pattern_bandwidth	received, sent	kilobits/s
web_log.url_pattern_request_processing_time	min, max, avg	milliseconds

Alerts

The following alerts are available:

Alert name	On metric	Description
web_log_1m_unmatched	web_log.excluded_requests	percentage of unparsed log lines over the last minute
web_log_1m_requests	web_log.type_requests	ratio of successful HTTP requests over the last minute (1xx, 2xx, 304, 401)
web_log_1m_redirects	web_log.type_requests	ratio of redirection HTTP requests over the last minute (3xx except 304)
web_log_1m_bad_requests	web_log.type_requests	ratio of client error HTTP requests over the last minute (4xx except 401)
web_log_1m_internal_errors	web_log.type_requests	ratio of server error HTTP requests over the last minute (5xx)
web_log_web_slow	web_log.request_processing_time	average HTTP response time over the last 1 minute
web_log_5m_requests_ratio	web_log.type_requests	ratio of successful HTTP requests over the last 5 minutes, compared with the previous 5 minutes

Setup

Prerequisites

No action required.

Configuration

File

The configuration file name for this integration is go.d/web_log.conf.

The file format is YAML. Generally, the structure is:

update_every: 1
autodetection_retry: 0
jobs:
  - name: some_name1
  - name: some_name2
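
As an illustration of the structure above, a minimal sketch of a file with a single job might look like the following. The job name and the log path are hypothetical; `path` and `exclude_path` are the options described under Config options.

```yaml
update_every: 1
autodetection_retry: 0
jobs:
  # Hypothetical job; adjust the name and path to your own server.
  - name: nginx_access
    path: /var/log/nginx/access.log   # required: path to the web server log file
    exclude_path: '*.gz'              # skip compressed, rotated logs (the documented default)
```

Each entry under `jobs` is an independent instance, so several log files can be monitored by adding more entries with distinct names.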

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/web_log.conf

Options

Weblog can parse and interpret the following fields (known fields):

nginx	apache	description
$host ($http_host)	%v	Name of the server which accepted a request.
$server_port	%p	Port of the server which accepted a request.
$scheme	-	Request scheme. "http" or "https".
$remote_addr	%a (%h)	Client address.
$request	%r	Full original request line. The line is "$request_method $request_uri $server_protocol".
$request_method	%m	Request method. Usually "GET" or "POST".
$request_uri	%U	Full original request URI.
$server_protocol	%H	Request protocol. Usually "HTTP/1.0", "HTTP/1.1", or "HTTP/2.0".
$status	%s (%>s)	Response status code.
$request_length	%I	Bytes received from a client, including request and headers.
$bytes_sent	%O	Bytes sent to a client, including the response headers.
$body_bytes_sent	%B (%b)	Bytes sent to a client, not counting the response header.
$request_time	%D	Request processing time.
$upstream_response_time	-	Time spent on receiving the response from the upstream server.
$ssl_protocol	-	Protocol of an established SSL connection.
$ssl_cipher	-	String of ciphers used for an established SSL connection.

Notes:

Apache %h logs the IP address if HostnameLookups is Off. The web log collector counts hostnames as IPv4 addresses. We recommend either disabling HostnameLookups or using %a instead of %h.
Since httpd 2.0, unlike 1.3, the %b and %B format strings do not represent the number of bytes sent to the client, but simply the size in bytes of the HTTP response. It will differ, for instance, if the connection is aborted, or if SSL is used. The %O format provided by mod_logio will log the actual number of bytes sent over the network.
To get %I and %O working, you need to enable mod_logio on Apache.
NGINX logs the URI with query parameters; Apache doesn't.
$request is parsed into $request_method, $request_uri, and $server_protocol. If your log format includes $request, there is no need to include the others.
Don't use both $bytes_sent and $body_bytes_sent (%O and %B or %b). The module does not distinguish between these parameters.

Config options

Name	Description	Default	Required
update_every	Data collection frequency.	1	no
autodetection_retry	Recheck interval in seconds. Zero means no recheck will be scheduled.	0	no
path	Path to the web server log file.		yes
exclude_path	Path to exclude.	*.gz	no
url_patterns	List of URL patterns.	[]	no
url_patterns.name	Used as a dimension name.		yes
url_patterns.pattern	Matched against the full original request URI. See the matcher documentation for pattern syntax.		yes
log_type	Log parser type.	auto	no
csv_config	CSV log parser config.		no
csv_config.delimiter	CSV field delimiter.	,	no
csv_config.format	CSV log format.		no
ltsv_config	LTSV log parser config.		no
ltsv_config.field_delimiter	LTSV field delimiter.	\t	no
ltsv_config.value_delimiter	LTSV value delimiter.	:	no
ltsv_config.mapping	LTSV fields mapping to known fields.		yes
json_config	JSON log parser config.		no
json_config.mapping	JSON fields mapping to known fields.		yes
regexp_config	RegExp log parser config.		no
regexp_config.pattern	RegExp pattern with named groups.		yes

url_patterns

"URL pattern" scope metrics will be collected for each URL pattern.

Option syntax:

url_patterns:
  - name: name1
    pattern: pattern1
  - name: name2
    pattern: pattern2
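
A concrete version of the syntax above could look like this sketch. The pattern names and URIs are hypothetical, and the pattern prefixes assume the go.d matcher short syntax, in which a leading `~` marks a regular expression and a leading `=` an exact string. Check the matcher documentation before relying on them.

```yaml
url_patterns:
  - name: api               # used as the dimension name
    pattern: '~ ^/api/'     # assumed matcher syntax: regexp match on the request URI
  - name: health
    pattern: '= /healthz'   # assumed matcher syntax: exact string match
```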

log_type

Weblog supports 5 different log parsers:

Parser type	Description
auto	Use CSV and auto-detect format
csv	Comma-separated values
json	JSON
ltsv	LTSV
regexp	Regular expression with named groups

Syntax:

log_type: auto

If the log_type parameter is set to auto (the default), weblog tries to auto-detect the appropriate log parser and log format using the last line of the log file. It performs the following steps:

checks if the format is CSV (using regexp).
checks if the format is JSON (using regexp).

assumes the format is CSV and tries to find the appropriate CSV log format using a predefined list of formats. It tries to parse the line using each of them in the following order (the first one that matches is used):

$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent

If you're using the default Apache/NGINX log format, auto-detect will work for you. If it doesn't work, you need to set the format manually.

csv_config.format
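
When auto-detection does not recognize a log, the format can be spelled out with the known fields. The sketch below assumes a space-separated access log whose lines end with the request time; the field order is hypothetical and must mirror your server's actual log format.

```yaml
log_type: csv
csv_config:
  delimiter: ' '   # fields in this assumed log are separated by spaces
  # Known fields in the order they appear on each log line.
  format: '$remote_addr - - [$time_local] "$request" $status $body_bytes_sent $request_time'
```

Because `$request` is present, `$request_method`, `$request_uri` and `$server_protocol` are left out of the format.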

ltsv_config.mapping

The mapping is a dictionary where the key is a field name as it appears in the logs, and the value is the corresponding known field.

Note: don't use $ and % prefixes for mapped field names.

log_type: ltsv
ltsv_config:
  mapping:
    label1: field1
    label2: field2
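
For illustration, suppose an LTSV log uses the labels `host`, `req`, `status` and `size` (hypothetical names, not taken from any specific server). A mapping to known fields might then be sketched as:

```yaml
log_type: ltsv
ltsv_config:
  field_delimiter: "\t"   # documented default, shown for clarity
  value_delimiter: ':'    # documented default, shown for clarity
  mapping:
    # label in the log: known field (no $ or % prefix)
    host: remote_addr
    req: request
    status: status
    size: body_bytes_sent
```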

json_config.mapping

The mapping is a dictionary where the key is a field name as it appears in the logs, and the value is the corresponding known field.

Note: don't use $ and % prefixes for mapped field names.

log_type: json
json_config:
  mapping:
    label1: field1
    label2: field2
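
As a sketch, assume each log line is a JSON object with the keys `client_ip`, `method`, `uri`, `code`, `bytes` and `duration` (hypothetical key names). The mapping to known fields could then look like:

```yaml
log_type: json
json_config:
  mapping:
    # JSON key in the log: known field (no $ or % prefix)
    client_ip: remote_addr
    method: request_method
    uri: request_uri
    code: status
    bytes: bytes_sent
    duration: request_time
```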

regexp_config.pattern

Use a pattern with named subexpressions. The names should be known fields.

Note: don't use $ and % prefixes for mapped field names.

Syntax:

log_type: regexp
regexp_config:
  pattern: PATTERN
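
A possible pattern for a common-log-style line is sketched below. It assumes Go regular expression syntax, where named groups are written as `(?P<name>...)`, and a hypothetical log layout of client address, timestamp, quoted request, status and body size. Adapt it to your actual log lines.

```yaml
log_type: regexp
regexp_config:
  # Each named group is a known field, written without the $ or % prefix.
  pattern: '^(?P<remote_addr>[^ ]+) - - \[[^\]]+\] "(?P<request>[^"]+)" (?P<status>[0-9]{3}) (?P<body_bytes_sent>[0-9]+)$'
```

The pattern is single-quoted so that YAML passes the backslashes through to the regular expression unchanged.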

Examples

There are no configuration examples.

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the web_log collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.

Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].
```
cd /usr/libexec/netdata/plugins.d/
```
Switch to the netdata user.
```
sudo -u netdata -s
```

Run the go.d.plugin to debug the collector:

./go.d.plugin -d -m web_log

To debug a specific job:

./go.d.plugin -d -m web_log -j jobName

Getting Logs

If you're encountering problems with the web_log collector, follow these steps to retrieve logs and identify potential issues:

Run the command specific to your system (systemd, non-systemd, or Docker container).
Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep web_log

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector's name:

grep web_log /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:

docker logs netdata 2>&1 | grep web_log
