Manage alerting rules
Define when to generate alerts and notifications with alerting rules.
In Alerts or Project settings → Management → Rules you can:
- Create and edit rules
- Manage rules, including enabling/disabling, muting/unmuting, and deleting
- Drill down to rule details
- Configure rule settings
For an overview of alerting concepts, go to Rules.
Create and edit rules
Click the Create rule button to open a flyout that guides you through selecting a rule type and configuring its conditions and actions.
The rule types available in an Elasticsearch project are:
After a rule is created, you can open the action menu (…) and select Edit rule to re-open the flyout and change the rule properties.
You can also manage rules as resources with the Elasticstack provider for Terraform. For more details, refer to the elasticstack_kibana_alerting_rule resource.
Snooze and disable rules
The rule listing enables you to quickly snooze, disable, enable, or delete individual rules. For example, you can change the state of a rule.
When you snooze a rule, the rule checks continue to run on a schedule but its alerts do not trigger any actions. You can snooze for a specified period of time, indefinitely, or schedule single or recurring downtimes.
When a rule is in a snoozed state, you can cancel or change the duration of this state.
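The snooze behavior described above can be illustrated with a minimal sketch. It is not how Kibana implements snoozing: the `SnoozeState` class and `should_run_actions` function are hypothetical names invented for this example, and scheduled or recurring downtimes are left out. The sketch only shows the idea that the rule check always runs while actions are gated by the snooze state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SnoozeState:
    """Hypothetical model of a rule's snooze state (not a Kibana type)."""
    snoozed: bool = False
    until: Optional[datetime] = None  # None while snoozed means "indefinitely"


def should_run_actions(state: SnoozeState, now: datetime) -> bool:
    """Return True if alert actions may run at `now`.

    The rule check itself always runs; only the actions are gated.
    """
    if not state.snoozed:
        return True
    if state.until is None:
        return False  # snoozed indefinitely
    return now >= state.until  # the snooze period has expired


now = datetime(2024, 1, 1, 12, 0)
timed = SnoozeState(snoozed=True, until=now + timedelta(hours=1))
indefinite = SnoozeState(snoozed=True)

print(should_run_actions(timed, now))                       # still snoozed
print(should_run_actions(timed, now + timedelta(hours=2)))  # snooze expired
print(should_run_actions(indefinite, now))                  # never expires on its own
```

Canceling a snooze corresponds to resetting `snoozed` to `False`, and changing its duration corresponds to updating `until`.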
Import and export rules
To import and export rules, use saved objects.
Rules are disabled on export. You are prompted to re-enable the rule on successful import.
View rule details
You can determine the health of a rule by looking at its Last response. A rule can have one of the following responses:
- failed: The rule ran with errors.
- succeeded: The rule ran without errors.
- warning: The rule ran with some non-critical errors.
Click the rule name to access a rule details page:
In this example, the rule detects when a site serves more than a threshold number of bytes in a 24-hour period. Four sites are above the threshold. These are called alerts (occurrences of the condition being detected), and the alert name, status, time of detection, and duration of the condition are shown in this view. Alerts come and go from the list depending on whether the rule conditions are met.
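To make the threshold condition concrete, the following sketch evaluates a rule of this kind over a list of in-memory events. It is an illustration only, not the rule type's actual query: the `sites_over_threshold` function, the event tuple layout, and the sample site names and byte counts are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sites_over_threshold(events, threshold_bytes, now, window=timedelta(hours=24)):
    """Return {site: total_bytes} for sites whose bytes served within
    the window ending at `now` exceed `threshold_bytes`.

    `events` is an iterable of (site, timestamp, num_bytes) tuples.
    """
    totals = defaultdict(int)
    for site, timestamp, num_bytes in events:
        if now - window <= timestamp <= now:
            totals[site] += num_bytes
    # Each site above the threshold corresponds to one alert.
    return {site: total for site, total in totals.items() if total > threshold_bytes}


now = datetime(2024, 1, 2, 0, 0)
events = [
    ("a.example", now - timedelta(hours=1), 700),
    ("a.example", now - timedelta(hours=5), 400),
    ("b.example", now - timedelta(hours=2), 300),
    ("c.example", now - timedelta(hours=30), 5000),  # outside the 24-hour window
]
alerts = sites_over_threshold(events, threshold_bytes=1000, now=now)
print(alerts)
```

Running the check again later with a different `now` can yield a different set of sites, which mirrors how alerts come and go from the list.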
When an alert is created, it generates actions. If the conditions that caused the alert persist, the actions run again according to the rule notification settings. There are three common alert statuses:
- active: The conditions for the rule are met and actions should be generated according to the notification settings.
- flapping: The alert is switching repeatedly between active and recovered states.
- recovered: The conditions for the rule are no longer met and recovery actions should be generated.
Flapping alerts
The flapping state is possible only if you have enabled alert flapping detection in Rules → Settings. A look back window and threshold are used to determine whether alerts are flapping. For example, you can specify that the alert must change status at least 6 times in the last 10 runs. If the rule has actions that run when the alert status changes, those actions are suppressed while the alert is flapping.
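The look back window and threshold can be sketched as a simple count of status changes. This is a hedged illustration rather than the product's algorithm: the `is_flapping` function is a hypothetical name, and it assumes that a "status change" means two consecutive runs in the window reporting different statuses. The actual bookkeeping may differ.

```python
from typing import Sequence


def is_flapping(statuses: Sequence[str], look_back_window: int = 10, threshold: int = 6) -> bool:
    """Return True if the alert changed status at least `threshold`
    times within the last `look_back_window` runs.

    `statuses` holds one status per rule run, oldest first.
    """
    recent = statuses[-look_back_window:]
    # Count transitions between consecutive runs (assumed definition of a change).
    changes = sum(1 for prev, cur in zip(recent, recent[1:]) if prev != cur)
    return changes >= threshold


steady = ["active"] * 10
noisy = ["active", "recovered"] * 5
mixed = ["active"] * 5 + ["recovered", "active", "recovered", "active", "recovered"]

print(is_flapping(steady))
print(is_flapping(noisy))
print(is_flapping(mixed))
```

Raising the threshold or shortening the window makes the check stricter, so fewer alerts are treated as flapping.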
If any rule actions failed to run successfully, you can see the details on the History tab. In the Message column, click the warning or expand icon, or click the number in the Errored actions column, to open the Errored Actions panel.
You can suppress future actions for a specific alert by turning on the Mute toggle. If a muted alert no longer meets the rule conditions, it stays in the list to avoid generating actions if the conditions recur. You can also disable a rule, which stops it from running checks and clears any alerts it was tracking. You may want to disable rules that are not currently needed to reduce the load on your cluster.