Netdata is an open-source system for distributed real-time performance and health monitoring. It provides real-time insight into everything happening on the systems it runs on (including containers and applications such as web and database servers), using modern interactive web dashboards. It is fast and efficient, designed to run permanently on all systems (physical and virtual servers, containers, IoT devices) without disrupting their core function.
It should run on any Linux system (including IoT devices), such as:
– Arch Linux
– CentOS
– Debian
– Fedora
– Gentoo
– openSUSE
– PLD Linux
– Red Hat Enterprise Linux
– SUSE
– Ubuntu
It also runs on FreeBSD and macOS.
netdata also supports:
- monitoring ephemeral nodes and auto-scaled containers,
- integration with existing monitoring infrastructure (time-series databases like Prometheus, Graphite, OpenTSDB) and third-party event notification methods (like Slack, PagerDuty, Pushover, and dozens more),
- building hierarchies of monitored nodes via real-time metrics streaming between them,
- embedding charts and dashboards on third-party web sites and applications, such as Atlassian’s Confluence.
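As a rough sketch of the time-series integration, the Python snippet below builds the URL that a Prometheus-style scraper could poll and filters the metric samples out of the returned text. It assumes netdata's `/api/v1/allmetrics` endpoint with `format=prometheus` and the default port 19999; the helper names and any host you pass in are hypothetical.

```python
from urllib.parse import urlencode


def prometheus_scrape_url(host, port=19999):
    """Build the URL a scraper would poll (endpoint and port are assumptions)."""
    query = urlencode({"format": "prometheus"})
    return f"http://{host}:{port}/api/v1/allmetrics?{query}"


def sample_lines(exposition_text):
    """Keep only metric samples: drop blank lines and '#' comment lines."""
    return [
        line
        for line in exposition_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
```

In practice a time-series database would do the polling itself; the helpers only show what is being requested and what kind of text comes back.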
Monitoring features:
The following is a list of what it currently monitors:
CPU:
usage, interrupts, softirqs, frequency, total and per core, CPU states
Memory:
RAM, swap and kernel memory usage, KSM (Kernel Samepage Merging), NUMA
Disks:
per disk: I/O, operations, backlog, utilization, space, software RAID (md)
Network interfaces:
per interface: bandwidth, packets, errors, drops
IPv4 networking:
bandwidth, packets, errors, fragments, tcp: connections, packets, errors, handshake, udp: packets, errors, broadcast: bandwidth, packets, multicast: bandwidth, packets
IPv6 networking:
bandwidth, packets, errors, fragments, ECT, udp: packets, errors, udplite: packets, errors, broadcast: bandwidth, multicast: bandwidth, packets, icmp: messages, errors, echos, router, neighbor, MLDv2, group membership, break down by type
Interprocess Communication – IPC:
semaphores and semaphore arrays
netfilter / iptables Linux firewall:
connections, connection tracker events, errors
Linux DDoS protection:
SYNPROXY metrics
fping latencies:
for any number of hosts, showing latency, packets and packet loss
Processes:
running, blocked, forks, active
Entropy:
random numbers pool, used in cryptography
NFS file servers and clients:
NFS v2, v3, v4: I/O, cache, read ahead, RPC calls
Network QoS:
the only tool that visualizes network tc classes in realtime
Linux Control Groups:
containers: systemd, lxc, docker
Applications:
by grouping the process tree and reporting CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets – per group
Users and User Groups resource usage:
by summarizing the process tree per user and group, reporting: CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets
Apache and lighttpd web servers:
mod-status (v2.2, v2.4) and cache log statistics, for multiple servers
Nginx web servers:
stub-status, for multiple servers
Tomcat:
accesses, threads, free memory, volume
web server log files:
extracting in real-time, web server performance metrics and applying several health checks
MySQL databases:
multiple servers, each showing: bandwidth, queries/s, handlers, locks, issues, tmp operations, connections, binlog metrics, threads, innodb metrics, and more
Postgres databases:
multiple servers, each showing: per database statistics (connections, tuples read – written – returned, transactions, locks), backend processes, indexes, tables, write ahead, background writer and more
Redis databases:
multiple servers, each showing: operations, hit rate, memory, keys, clients, slaves
couchdb:
reads/writes, request methods, status codes, tasks, replication, per-db, etc
mongodb:
operations, clients, transactions, cursors, connections, asserts, locks, etc
memcached databases:
multiple servers, each showing: bandwidth, connections, items
elasticsearch:
search and index performance, latency, timings, cluster statistics, threads statistics, etc
ISC Bind name servers:
multiple servers, each showing: clients, requests, queries, updates, failures and several per view metrics
NSD name servers:
queries, zones, protocols, query types, transfers, etc.
PowerDNS:
queries, answers, cache, latency, etc.
Postfix email servers:
message queue (entries, size)
exim email servers:
message queue (emails queued)
Dovecot POP3/IMAP servers
ISC dhcpd:
pools utilization, leases, etc.
IPFS:
bandwidth, peers
Squid proxy servers:
multiple servers, each showing: clients bandwidth and requests, servers bandwidth and requests
HAproxy:
bandwidth, sessions, backends, etc
varnish:
threads, sessions, hits, objects, backends, etc
OpenVPN:
status per tunnel
Hardware sensors:
lm_sensors and IPMI: temperature, voltage, fans, power, humidity
NUT and APC UPSes:
load, charge, battery voltage, temperature, utility metrics, output metrics
PHP-FPM:
multiple instances, each reporting connections, requests, performance
hddtemp:
disk temperatures
smartd:
disk S.M.A.R.T. values
SNMP devices:
can be monitored too (although you will need to configure these)
chrony:
frequencies, offsets, delays, etc.
beanstalkd:
global and per tube monitoring
statsd:
netdata is a fully featured statsd server
ceph:
OSD usage, Pool usage, number of objects, etc.
And you can extend it, by writing plugins that collect data from any source, using any computer language.
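To illustrate the plugin idea, here is a minimal Python sketch of an external collector. It assumes netdata's line-based external plugin protocol, in which a plugin writes `CHART`, `DIMENSION`, `BEGIN`, `SET` and `END` lines to standard output; the chart `example.random`, the dimension `value` and the helper function names are hypothetical.

```python
import random
import sys
import time


def chart_definition(chart_id, title, units, dimension, update_every=1):
    """Return the lines that declare one line chart with a single dimension."""
    return [
        # CHART type.id name title units family context charttype priority update_every
        f"CHART {chart_id} '' '{title}' '{units}' example example line 90000 {update_every}",
        # DIMENSION id name algorithm multiplier divisor
        f"DIMENSION {dimension} '' absolute 1 1",
    ]


def chart_update(chart_id, dimension, value):
    """Return the lines that report one collected value for the chart."""
    return [f"BEGIN {chart_id}", f"SET {dimension} = {value}", "END"]


def run(iterations, update_every=1, out=sys.stdout, sleep=time.sleep):
    """Declare the chart once, then emit one value per collection interval."""
    definition = chart_definition(
        "example.random", "A random number", "value", "value", update_every
    )
    for line in definition:
        print(line, file=out)
    for _ in range(iterations):
        value = random.randint(0, 100)
        for line in chart_update("example.random", "value", value):
            print(line, file=out)
        out.flush()  # the daemon reads the stream incrementally
        sleep(update_every)
```

Calling `run(iterations=5)` from a script prints the chart declaration followed by five updates. A real plugin would loop indefinitely and read its values from the source being monitored rather than from `random`.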
netdata core values
Value | netdata |
---|---|
high resolution metrics | collects all metrics every single second |
unlimited metrics | collects thousands of metrics per monitored node |
real-time visualization | dashboards run with sub-second latency, collection to visualization |
powerful anomaly detection | has a distributed watchdog embedded in it, running on all monitored nodes |
visual anomaly detection | dashboards are optimized for spotting anomalies, across all metrics |
meaningful presentation | dashboards present all metrics in a structured, easy to understand, way |
zero configuration | auto-detects all metrics and comes with dozens of alarms |
resource utilization | core is optimized C code, using <1% utilization of single CPU core |
Installation on Linux
The latest release of netdata can be installed on Linux using your distribution's package manager:
- Arch Linux (sudo pacman -S netdata)
- Alpine Linux (sudo apk add netdata)
- Debian Linux (sudo apt install netdata)
- Gentoo Linux (sudo emerge --ask netdata)
- OpenSUSE (sudo zypper install netdata)
- Solus Linux (sudo eopkg install netdata)
- Ubuntu Linux >= 18.04 (sudo apt install netdata)
On any modern Linux system, you can use a one-line installation script that installs the latest netdata and keeps it up to date automatically:
bash <(curl -Ss https://my-netdata.io/kickstart.sh) # On 32-bit
bash <(curl -Ss https://my-netdata.io/kickstart-static64.sh) # On 64-bit
After installation, netdata listens on port 19999 by default. You can change the port by editing the value of the default port setting in the /etc/netdata/netdata.conf config file.
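Once the daemon is listening, the same data the dashboard shows can also be read over HTTP. The sketch below builds a query URL and fetches it; it assumes netdata's REST API endpoint `/api/v1/data` with its `chart`, `after` and `format` parameters and a chart named `system.cpu`. The host and the helper names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def data_url(host="localhost", port=19999, chart="system.cpu", seconds=60):
    """Build a URL asking for the last `seconds` seconds of one chart as JSON."""
    query = urlencode({"chart": chart, "after": -seconds, "format": "json"})
    return f"http://{host}:{port}/api/v1/data?{query}"


def fetch(url, timeout=5):
    """Fetch and decode the JSON response (needs a running netdata to succeed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

If you change the default port in the config file, pass the new value as `port` so the URL matches.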
Netdata data collection and history settings
By default, netdata collects metrics every second and keeps one hour of history in memory (3600 entries). You can change these settings in the /etc/netdata/netdata.conf config file.
Options in the [global] section:
setting | info |
---|---|
update every | The frequency, in seconds, of data collection. For more information see Performance. |
history | The number of entries the netdata daemon will by default keep in memory for each chart dimension. This setting can also be configured per chart. Check Memory Requirements for more information. |
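To make the relationship between these two settings concrete, the sketch below parses a `[global]` section and works out how much wall-clock time the history covers. The per-dimension memory figure assumes 4 bytes per stored value, which is an assumption about netdata's in-memory database rather than something stated above; the sample values and the helper name are illustrative.

```python
import configparser

# Illustrative settings: collect every 2 seconds, keep 7200 entries.
SAMPLE = """
[global]
update every = 2
history = 7200
"""

BYTES_PER_VALUE = 4  # assumption: size of one stored value per dimension


def retention(config_text):
    """Return the retained time span and estimated memory per chart dimension."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["global"]
    # Fall back to the defaults described above when a setting is absent.
    update_every = section.getint("update every", fallback=1)
    history = section.getint("history", fallback=3600)
    return {
        "seconds_retained": update_every * history,
        "bytes_per_dimension": history * BYTES_PER_VALUE,
    }


print(retention(SAMPLE))
# {'seconds_retained': 14400, 'bytes_per_dimension': 28800}
```

With the defaults (1 second, 3600 entries) the same calculation gives 3600 seconds of retention, which matches the one hour mentioned above.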
Live demo: http://my-netdata.io/
For more documentation on installing and configuring netdata, see https://github.com/netdata/netdata/wiki/