ES|QL

Outlier hosts via z-score (STD_DEV)

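STD_DEV measures how widely each host's CPU load varies. The z-score of the peak, (max − mean) / standard deviation, separates a genuinely anomalous spike from load that is simply volatile: the query flags hosts whose peak lies more than 3 standard deviations above their own mean.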
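
For reference, the score can be written out as follows. The symbols are notation chosen here for illustration, not names from the query: $x_{\max}$, $\bar{x}$, and $s$ stand for a host's peak, mean, and standard deviation of CPU load over the time window, which the query calls cpu_max, cpu_moy, and cpu_ecart_type.

```latex
z_{\max} = \frac{x_{\max} - \bar{x}}{s},
\qquad \text{host flagged when } z_{\max} > 3
```

For a given peak, a host with steady load (small $s$) scores higher than a host whose load normally swings widely.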

Prerequisites

Elasticsearch 8.15+, Metricbeat

ES|QL
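FROM "metrics-system.cpu-*"
// Keep only the last 6 hours of samples
| WHERE @timestamp >= NOW() - 6 hours
// Convert the normalized ratio (0-1) to a percentage
| EVAL cpu_pct = system.cpu.total.norm.pct * 100
// Per host: mean (cpu_moy), standard deviation (cpu_ecart_type), and peak (cpu_max)
| STATS
    cpu_moy = ROUND(AVG(cpu_pct), 1),
    cpu_ecart_type = ROUND(STD_DEV(cpu_pct), 1),
    cpu_max = ROUND(MAX(cpu_pct), 1)
  BY host.name
// Skip hosts with flat load: a zero standard deviation would divide by zero
| WHERE cpu_ecart_type > 0
// z-score of the peak: how many standard deviations it sits above the mean
| EVAL z_max = ROUND((cpu_max - cpu_moy) / cpu_ecart_type, 1)
| WHERE z_max > 3
| SORT z_max DESC
| LIMIT 20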

Result

host.name    | cpu_moy | cpu_ecart_type | cpu_max | z_max
-------------+---------+----------------+---------+------
db-prod-03   |    22.4 |            4.1 |    97.8 |  18.4
cache-redis2 |    18.6 |            3.9 |    51.0 |   8.3
web-prod-12  |    41.3 |            6.2 |    88.9 |   7.7
batch-node-1 |    35.0 |            8.7 |    99.2 |   7.4
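
As a quick arithmetic check, the first row can be recomputed with the ROW source command, which builds a single row from literal values and needs no index. This is only a sketch that reuses the rounded figures from the table above; it should return z_max = 18.4 for db-prod-03.

```esql
// Rebuild the db-prod-03 row from the rounded values in the result table
ROW cpu_moy = 22.4, cpu_ecart_type = 4.1, cpu_max = 97.8
// Same expression as in the main query
| EVAL z_max = ROUND((cpu_max - cpu_moy) / cpu_ecart_type, 1)
```

Swapping in the values from another row gives a similar spot check for the other hosts.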

Tags: STD_DEV, Z-score, CPU, Anomaly
