RAID1 device missing alert not present #7913

@nrauso

Description

We recently added RAID alerts to check RAID health (see #7463), but one relevant scenario is not yet covered.
If an instance starts with an already degraded RAID1 array (for example, a disk is missing at boot time), the system boots with a "device missing" status but no alert is raised.
As a result, administrators may not be notified that the RAID is degraded.

Steps to reproduce

  1. Configure a RAID1 array.
  2. Stop the instance or node.
  3. Remove or detach one of the RAID devices.
  4. Start the instance again.

Expected behavior

An alert should be triggered when the system starts with a degraded RAID array (device missing).

Actual behavior

The system boots successfully with the RAID in degraded state, but no alert is generated.

To fix this, we need to add an alert rule that detects RAID arrays that are already degraded at boot time (device missing state).
The following Prometheus alert template should be integrated into the metrics app:
https://samber.github.io/awesome-prometheus-alerts/rules.html#rule-host-and-hardware-1-24

This rule would allow detection of RAID devices that are missing or degraded when the node starts.
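
A minimal sketch of what such a rule might look like, assuming the metrics app scrapes node_exporter with its mdadm collector enabled (which exposes `node_md_disks_required` and `node_md_disks`). This is not a copy of the linked template; the group name, alert name, `for` duration, and severity label are placeholders:

```yaml
groups:
  - name: raid                      # hypothetical group name
    rules:
      - alert: RaidDeviceMissing    # hypothetical alert name
        # Fires when an array has fewer active disks than it requires.
        # This compares current counts, so it should also match an array
        # that was already degraded when the node booted.
        expr: node_md_disks_required - ignoring (state) node_md_disks{state="active"} > 0
        for: 5m                     # assumed duration, tune as needed
        labels:
          severity: warning         # assumed severity
        annotations:
          summary: "RAID array {{ $labels.device }} on {{ $labels.instance }} is degraded"
```

Because the expression checks the current disk counts instead of watching for a failure event, it should not depend on the disk going missing while the node is running.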

Components

  • metrics:1.2.3

Metadata

Labels

No labels

Type

Projects

Status

In Progress

Relationships

None yet

Development

No branches or pull requests