I have received two notifications from our main server today about failed disks in a RAID array. This appears to be a software RAID set up by a former staff member years before I started at the company.
I have never set up or managed RAID on Linux before, so before I make any changes or run any commands, I am hoping to better understand what is going on and how to potentially fix it.
The two notifications I have received are:
A Fail event had been detected on md device /dev/md/1.
It could be related to component device /dev/sda1.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid1]
md0 : active raid1 sdb2[1] sda2[0](F)
      480192320 blocks super 1.0 [2/1] [_U]
      bitmap: 4/4 pages [16KB], 65536KB chunk

md1 : active raid1 sdb1[1] sda1[0](F)
      8188864 blocks super 1.2 [2/1] [_U]

unused devices: <none>
and:
A Fail event had been detected on md device /dev/md/0.
It could be related to component device /dev/sda2.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid1]
md0 : active raid1 sdb2[1] sda2[0](F)
      480192320 blocks super 1.0 [2/1] [_U]
      bitmap: 3/4 pages [12KB], 65536KB chunk

md1 : active raid1 sdb1[1] sda1[0]
      8188864 blocks super 1.2 [2/2] [UU]

unused devices: <none>
I know that there are two physical disks in the server, both 500 GB (465 GB usable). To me, these notifications suggest that one disk is md0, split into two partitions, and the other disk is md1, also split into two partitions.
Is that correct? Or is this actually showing that one of the physical disks has failed and needs replacing?
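In case it is useful, these are the read-only commands I was planning to run next to confirm the layout and the state of each array before touching anything. The device names are just taken from the notifications above, so please tell me if any of them are wrong, or if any of these commands are not actually safe to run on a degraded array:

    # Detailed status of each array, including which component is marked faulty
    mdadm --detail /dev/md0
    mdadm --detail /dev/md1

    # How the two disks are partitioned and which partitions back which md device
    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT

    # SMART health of the disk both notifications point at
    smartctl -a /dev/sda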