I am trying to control a Prometheus alert so that, for the same expr, it is triggered again only after 4 hours if it has already triggered once. I have appname and operation configured, along with a threshold for generating the alert. For the same appname and operation, can we make the alert trigger again only after a specified time? (The first time, it should trigger after 5m.)
- alert: DummyAlert
  expr: sum by (appname, operation) (increase(mycounter{appname="abc", operation="xyz"}[5m])) > 0
  for: 5m
  labels:
    severity: critical
    maintainedby: test
    teamname: test
  annotations:
    summary: test alert
    description: 'test alert'
Alerts are being acknowledged and there is no repetition of the same alert. Can someone suggest a change to the alert?
Since you want the timeout to span the time between two separate firings of the alert, Alertmanager configuration cannot solve this for you: all data for a resolved alert is dropped and cannot be referenced within Alertmanager itself.
But luckily you can incorporate a check that prevents the alert from triggering more often than once every four hours. For this you can use the built-in pseudo-metric ALERTS of Prometheus.
In such a query, last_over_time(ALERTS{alertname="DummyAlert", alertstate="firing"}[4h]) returns alerts named DummyAlert that fired during the last four hours. The whole query yields a result only if the initial query returns some result and DummyAlert for the same appname and operation has not fired during those last four hours.
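The combined query itself is not shown in the text above, so here is a sketch of what the description implies. It assumes the original expression is joined to the ALERTS check with unless on (appname, operation); that operator choice is my reconstruction from the description, not a quote, and the rule is untested.

```yaml
- alert: DummyAlert
  # Original condition, minus any (appname, operation) pair for which
  # DummyAlert was already in the firing state during the last 4 hours.
  # The `unless on (...)` join is an assumption reconstructed from the prose.
  expr: |
    sum by (appname, operation) (increase(mycounter{appname="abc", operation="xyz"}[5m])) > 0
    unless on (appname, operation)
    last_over_time(ALERTS{alertname="DummyAlert", alertstate="firing"}[4h])
  for: 5m
  labels:
    severity: critical
    maintainedby: test
    teamname: test
  annotations:
    summary: test alert
    description: 'test alert'
```

The on (appname, operation) clause restricts matching to the two labels produced by the sum by, because ALERTS series also carry alertname, alertstate and the rule's own labels. One thing to check when trying this: once the alert is firing, its own ALERTS series should make the expression return nothing on the next evaluation, so the alert will probably resolve soon after it fires.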