Netdata – Real-Time performance and health monitoring solution for Linux – All things in moderation

Netdata is an open-source system for distributed real-time performance and health monitoring. It provides real-time insight into everything happening on the systems it runs on (including containers and applications such as web and database servers), using modern interactive web dashboards. It is fast and efficient, designed to run permanently on all systems (physical and virtual servers, containers, IoT devices) without disrupting their core function.

It should run on any Linux system (including IoT devices), such as:
– Arch Linux
– CentOS
– Debian
– Fedora
– Gentoo
– openSUSE
– PLD Linux
– Red Hat Enterprise Linux
– SUSE
– Ubuntu

It also runs on FreeBSD and macOS.

netdata also supports:

  • monitoring ephemeral nodes and auto-scaled containers,

  • integration with existing monitoring infrastructure (time-series databases like prometheus, graphite, opentsdb) and third-party event notification methods (like slack, pagerduty, pushover, and dozens more),

  • building hierarchies of monitored nodes via real-time metrics streaming between them,

  • embedding charts and dashboards on third party web sites and applications, such as Atlassian’s Confluence.
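As one illustration of the time-series integration mentioned above, netdata exposes its current metrics over HTTP in several formats, including the Prometheus text format. The sketch below builds the URL of that endpoint and fetches it with curl. The helper names `allmetrics_url` and `fetch_prometheus_metrics` are made up for this example, and the host and port (localhost, 19999) are assumptions about a default local installation.

```shell
# Build the URL of netdata's "allmetrics" API endpoint.
# Usage: allmetrics_url HOST PORT FORMAT
# (helper name is hypothetical; the endpoint path is netdata's v1 API)
allmetrics_url() {
    printf 'http://%s:%s/api/v1/allmetrics?format=%s\n' "$1" "$2" "$3"
}

# Fetch the current metrics in Prometheus text format.
# Needs curl and a running netdata; host and port default to localhost:19999.
fetch_prometheus_metrics() {
    curl -Ss "$(allmetrics_url "${1:-localhost}" "${2:-19999}" prometheus)"
}
```

Calling `fetch_prometheus_metrics` against a running netdata should print one line per metric, which a Prometheus server could scrape from the same URL.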

Monitor Features:

The following is a list of what it currently monitors:

CPU:
usage, interrupts, softirqs, frequency, total and per core, CPU states

Memory:
RAM, swap and kernel memory usage, KSM (Kernel Samepage Merging), NUMA

Disks:
per disk: I/O, operations, backlog, utilization, space, software RAID (md)

Network interfaces:
per interface: bandwidth, packets, errors, drops

IPv4 networking:
bandwidth, packets, errors, fragments, tcp: connections, packets, errors, handshake, udp: packets, errors, broadcast: bandwidth, packets, multicast: bandwidth, packets

IPv6 networking:
bandwidth, packets, errors, fragments, ECT, udp: packets, errors, udplite: packets, errors, broadcast: bandwidth, multicast: bandwidth, packets, icmp: messages, errors, echos, router, neighbor, MLDv2, group membership, break down by type

Interprocess Communication – IPC:
such as semaphores and semaphore arrays

netfilter / iptables Linux firewall:
connections, connection tracker events, errors

Linux DDoS protection:
SYNPROXY metrics

fping latencies:
for any number of hosts, showing latency, packets and packet loss

Processes:
running, blocked, forks, active

Entropy:
random numbers pool, used in cryptography

NFS file servers and clients:
NFS v2, v3, v4: I/O, cache, read ahead, RPC calls

Network QoS:
the only tool that visualizes network tc classes in realtime

Linux Control Groups:
containers: systemd, lxc, docker

Applications:
by grouping the process tree and reporting CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets – per group

Users and User Groups resource usage:
by summarizing the process tree per user and group, reporting: CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets

Apache and lighttpd web servers:
mod-status (v2.2, v2.4) and cache log statistics, for multiple servers

Nginx web servers:
stub-status, for multiple servers

Tomcat:
accesses, threads, free memory, volume

web server log files:
extracting in real-time, web server performance metrics and applying several health checks

MySQL databases:
multiple servers, each showing: bandwidth, queries/s, handlers, locks, issues, tmp operations, connections, binlog metrics, threads, innodb metrics, and more

Postgres databases:
multiple servers, each showing: per database statistics (connections, tuples read – written – returned, transactions, locks), backend processes, indexes, tables, write ahead, background writer and more

Redis databases:
multiple servers, each showing: operations, hit rate, memory, keys, clients, slaves

couchdb:
reads/writes, request methods, status codes, tasks, replication, per-db, etc

mongodb:
operations, clients, transactions, cursors, connections, asserts, locks, etc

memcached databases:
multiple servers, each showing: bandwidth, connections, items

elasticsearch:
search and index performance, latency, timings, cluster statistics, threads statistics, etc

ISC Bind name servers:
multiple servers, each showing: clients, requests, queries, updates, failures and several per view metrics

NSD name servers:
queries, zones, protocols, query types, transfers, etc.

PowerDNS:
queries, answers, cache, latency, etc.

Postfix email servers:
message queue (entries, size)

exim email servers:
message queue (emails queued)

Dovecot POP3/IMAP servers

ISC dhcpd:
pools utilization, leases, etc.

IPFS:
bandwidth, peers

Squid proxy servers:
multiple servers, each showing: clients bandwidth and requests, servers bandwidth and requests

HAproxy:
bandwidth, sessions, backends, etc

varnish:
threads, sessions, hits, objects, backends, etc

OpenVPN:
status per tunnel

Hardware sensors:
lm_sensors and IPMI: temperature, voltage, fans, power, humidity

NUT and APC UPSes:
load, charge, battery voltage, temperature, utility metrics, output metrics

PHP-FPM:
multiple instances, each reporting connections, requests, performance

hddtemp:
disk temperatures

smartd:
disk S.M.A.R.T. values

SNMP devices:
can be monitored too (although you will need to configure these)

chrony:
frequencies, offsets, delays, etc.

beanstalkd:
global and per tube monitoring

statsd:
netdata is a fully featured statsd server

ceph:
OSD usage, Pool usage, number of objects, etc.
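Because netdata acts as a statsd server (see the statsd entry above), an application can push its own metrics to it as plain text lines of the form `name:value|type`. The following is a minimal sketch: the function names and the metric name `myapp.requests` are hypothetical, and it assumes netdata's statsd listener is on UDP port 8125 on localhost and that `nc` is installed.

```shell
# Format one statsd line: NAME:VALUE|TYPE
# TYPE is a statsd metric type, e.g. c (counter), g (gauge), ms (timer).
statsd_line() {
    printf '%s:%s|%s\n' "$1" "$2" "$3"
}

# Send one metric over UDP.
# Host and port are assumptions (localhost:8125); override with
# STATSD_HOST / STATSD_PORT if your setup differs.
statsd_send() {
    statsd_line "$1" "$2" "$3" | nc -u -w1 "${STATSD_HOST:-localhost}" "${STATSD_PORT:-8125}"
}
```

For example, `statsd_send myapp.requests 1 c` would increment a counter, which should then show up as a chart on the dashboard.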

And you can extend it, by writing plugins that collect data from any source, using any computer language.
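To give a feel for what such a plugin looks like, here is a minimal sketch of an external plugin written in shell. External plugins talk to netdata by printing text commands to standard output: `CHART` and `DIMENSION` describe a chart once, and `BEGIN`, `SET` and `END` report each new sample. The chart `example.random` and its dimension `value` are made-up names for illustration, and the function names are not part of netdata.

```shell
#!/usr/bin/env bash
# Hypothetical external plugin that charts one random number per update.

# netdata passes the update interval (in seconds) as the first argument.
update_every="${1:-1}"

# Describe the chart and its single dimension; printed once at startup.
emit_definition() {
    echo "CHART example.random '' 'A random number' 'value' example example.random line 90000 ${update_every}"
    echo "DIMENSION value '' absolute 1 1"
}

# Report one collected value for the chart.
emit_sample() {
    echo "BEGIN example.random"
    echo "SET value = $1"
    echo "END"
}

# Main loop: define the chart, then report a new value every interval.
run_plugin() {
    emit_definition
    while true; do
        emit_sample "$RANDOM"
        sleep "${update_every}"
    done
}
```

A real plugin would end with a call to `run_plugin` and be installed as an executable file in netdata's plugins directory. The exact location and naming rules depend on the installation, so check the plugin documentation before relying on this sketch.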

netdata core values

Value: netdata
high resolution metrics: collects all metrics every single second
unlimited metrics: collects thousands of metrics per monitored node
real-time visualization: dashboards run with sub-second latency, collection to visualization
powerful anomaly detection: has a distributed watchdog embedded in it, running on all monitored nodes
visual anomaly detection: dashboards are optimized for spotting anomalies, across all metrics
meaningful presentation: dashboards present all metrics in a structured, easy to understand, way
zero configuration: auto-detects all metrics and comes with dozens of alarms
resource utilization: core is optimized C code, using <1% utilization of a single CPU core

Installation on Linux systems

The latest release of netdata can be easily installed on Linux using your package manager:

  • Arch Linux (sudo pacman -S netdata)
  • Alpine Linux (sudo apk add netdata)
  • Debian Linux (sudo apt install netdata)
  • Gentoo Linux (sudo emerge --ask netdata)
  • OpenSUSE (sudo zypper install netdata)
  • Solus Linux (sudo eopkg install netdata)
  • Ubuntu Linux >= 18.04 (sudo apt install netdata)

On any modern Linux system, you can use a one-line installation script that installs the latest netdata and keeps it up to date automatically.

    bash <(curl -Ss https://my-netdata.io/kickstart.sh)            # builds from source, any Linux system
    bash <(curl -Ss https://my-netdata.io/kickstart-static64.sh)   # pre-built static binary, 64-bit Linux only

After installation, netdata listens on port 19999 by default. You can change this by editing the default port setting in the /etc/netdata/netdata.conf config file.
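To check which port is actually configured, you can read the setting back from the config file. The helper below is a small sketch (the name `netdata_conf_get` is made up for this example). It prints the first uncommented value of an option and skips lines starting with `#`, which netdata.conf uses for commented-out defaults.

```shell
# Print the active (uncommented) value of one netdata.conf option.
# Usage: netdata_conf_get FILE OPTION
# (hypothetical helper, not part of netdata)
netdata_conf_get() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}
```

For example, `netdata_conf_get /etc/netdata/netdata.conf 'default port'` prints the configured port. If it prints nothing, the option is not set and the default of 19999 applies. After changing the value, netdata has to be restarted for the new port to take effect.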

Netdata data collection and history configuration

By default, netdata collects metrics every second and keeps the last hour of data in memory (3600 history entries). You can change this in the /etc/netdata/netdata.conf config file:

Options in the [global] section:

setting: info
update every: The frequency, in seconds, of data collection. For more information see Performance.
history: The number of entries the netdata daemon keeps in memory by default for each chart dimension. This setting can also be configured per chart. Check Memory Requirements for more information.
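The two settings together determine how far back the dashboard can look: the retained time span is the number of history entries multiplied by the collection interval. A small sketch of that arithmetic, using a made-up helper name:

```shell
# Retained time span in seconds = history entries * update every (seconds).
# Usage: retention_seconds HISTORY UPDATE_EVERY
retention_seconds() {
    echo $(( $1 * $2 ))
}

retention_seconds 3600 1    # defaults: 3600 seconds, i.e. one hour
retention_seconds 3600 2    # collecting every 2 seconds: 7200 seconds, i.e. two hours
```

Raising history keeps more data but uses more memory per chart dimension, while raising update every lowers the resolution of the charts.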

Live demo: http://my-netdata.io/

For more documentation on installing and configuring netdata, see https://github.com/netdata/netdata/wiki/
