author | 2024-06-15 03:57:56 +0100 | |
---|---|---|
committer | 2024-06-16 11:30:18 +0100 | |
commit | 967fc64187bc87f8e079ab8cf973f2e9a2e8030b (patch) | |
tree | e4378ddcf3e60392fde89cdc801d986b3b8d9efe /kubernetes | |
parent | Add documentation to ff-bot.yml policy file (diff) |
Add Kubernetes volume alerts
It seems that Linode has added storage reporting info to the CSI driver
allowing us to pick up on the storage use of persistent volume claims
within the cluster.
This creates and deploys an alert that will report if any volume has
under 10% of space left. I have excluded Prometheus as our TSDB
retention settings mean that it will always stay just below its volume
size by design.
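
The threshold logic described above can be sketched in a few lines of Python. This is an illustrative model of the rule's arithmetic only, not part of the commit: the function names and the sample claim names in the usage note are hypothetical, and the sketch ignores the rule's `for: 2m` hold period.

```python
# Illustrative model of the alert condition; not code from the repository.
EXCLUDED_CLAIMS = {"prometheus-storage"}  # claim excluded by the rule's label matcher
THRESHOLD_PERCENT = 10                    # alert when less than 10% is free


def percent_free(available_bytes: float, capacity_bytes: float) -> float:
    """Return the share of the volume that is still free, as a percentage.

    Assumes capacity_bytes is greater than zero.
    """
    return available_bytes / capacity_bytes * 100


def is_low_on_space(claim: str, available_bytes: float, capacity_bytes: float) -> bool:
    """Return True when a non-excluded claim has under 10% of its space left."""
    if claim in EXCLUDED_CLAIMS:
        return False
    return percent_free(available_bytes, capacity_bytes) < THRESHOLD_PERCENT
```

For example, a hypothetical claim `data-volume` with 5 GiB free out of 100 GiB would satisfy the condition, while `prometheus-storage` never would, however full it is.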
Diffstat (limited to 'kubernetes')
-rw-r--r-- | kubernetes/namespaces/monitoring/alerts/alerts.d/volumes.yaml | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/kubernetes/namespaces/monitoring/alerts/alerts.d/volumes.yaml b/kubernetes/namespaces/monitoring/alerts/alerts.d/volumes.yaml
new file mode 100644
index 0000000..790d3f7
--- /dev/null
+++ b/kubernetes/namespaces/monitoring/alerts/alerts.d/volumes.yaml
@@ -0,0 +1,11 @@
+groups:
+- name: volumes
+  rules:
+  - alert: KubernetesVolumeOutOfDiskSpace
+    expr: kubelet_volume_stats_available_bytes{persistentvolumeclaim!="prometheus-storage"} / kubelet_volume_stats_capacity_bytes * 100 < 10
+    for: 2m
+    labels:
+      severity: page
+    annotations:
+      summary: Kubernetes Volume {{ $labels.kubernetes_namespace }}/{{ $labels.persistentvolumeclaim }} is running low on disk space
+      description: "Volume is almost full (< 10% left)\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"