Raid 1 not working

MattWinter

I'm hoping some of the experts can help me out with this. There's a lot going on here.
I have a dual-boot system running Arch and Gentoo. I had a 2TB raid1 array (2 mirrored 2TB drives), and recently (about a month ago) installed a 12TB standalone drive. I'm also using rEFInd, if that matters. I partitioned the new 12TB drive using GNU parted.

The raid can no longer be mounted. I don't know if this has anything to do with the new drive. It may not. We also had a power outage recently.
I've tried reassembling the array (md127, using sda1 and sdb1). I've also tried rebuilding the array. In both cases it looks like it succeeds, but when I run mdadm --examine, I get the error: No md superblock detected on /dev/md127.
Running smartctl on sda1 revealed errors, so I tried all of the above using sdb1 only. After rebuilding with mdadm --build, it looks like I get valid info when running mdadm --examine, but mounting still fails with the error: unknown filesystem type 'linux_raid_member'.
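
A note on the two error messages above, offered as a hedged sketch rather than a diagnosis: per the mdadm man page, `--examine` reads the md superblock from a member device such as sdb1, while `--detail` reports on an assembled array such as md127. So "No md superblock detected on /dev/md127" may simply mean `--examine` was pointed at the array instead of a member. The helper below (`md_query_cmd`, a hypothetical name) only prints the matching read-only query for each device named in this post; it does not run mdadm or touch the disks.

```shell
#!/bin/sh
# Pick the mdadm query mode that matches the kind of device being inspected.
# Array devices (/dev/mdN) are queried with --detail; member partitions
# (e.g. /dev/sdb1) carry the md superblock and are queried with --examine.
# md_query_cmd is an illustrative helper name, not part of mdadm.
md_query_cmd() {
    case "$1" in
        /dev/md*) printf 'mdadm --detail %s\n' "$1" ;;
        *)        printf 'mdadm --examine %s\n' "$1" ;;
    esac
}

# Print (not run) the read-only queries for the devices named in the post.
for dev in /dev/md127 /dev/sda1 /dev/sdb1; do
    md_query_cmd "$dev"
done
```

Similarly, "unknown filesystem type 'linux_raid_member'" is what mount typically reports when it is given a raid member rather than the array device, and `--build` is documented for arrays without superblocks, so it may not be the right mode for a mirror that has them. Both points are assumptions about this system until checked.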

I just installed testdisk hoping to recover data off the drive and put it on that 12TB. However, I got an error saying the drive was full. Running df -h only shows 2TB available on the 12TB drive, though lsblk shows 10.9T. I don't know if this is related.
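
For what it's worth, the 10.9T that lsblk shows is just 12 TB expressed in binary units, so the disk itself appears to be seen at full size. A quick sketch of that arithmetic (`tb_to_tib` is a made-up helper name, not a real tool):

```shell
#!/bin/sh
# Convert decimal terabytes (vendor units, 10^12 bytes) to binary tebibytes
# (2^40 bytes), which is what the "T" suffix in lsblk output means.
tb_to_tib() {
    awk -v tb="$1" 'BEGIN { printf "%.1f\n", tb * 1e12 / (2 ^ 40) }'
}

tb_to_tib 12   # the new standalone drive   -> 10.9
tb_to_tib 2    # each of the mirrored drives -> 1.8
```

Since df reports the size of the mounted filesystem rather than the disk, the 2TB figure would then point at the partition or filesystem being smaller than the disk. That is an assumption until the partition table is checked with something like `parted -l` or `lsblk -b`.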

In diagnosing this issue, there were a lot of other weird errors my system was running into. Sometimes the machine would boot, but the USB keyboard wouldn't work. This was fixed by plugging the keyboard into a different port. It now works back in the old port. Sometimes when trying to restart the machine, it would hang and I would just get a black screen. A few times I tried to boot to Clonezilla, but again, the machine would hang. I'm not sure if there's more going on, or if this was just a result of the machine trying to read a bad raid array.

Anyone have any thoughts here?
 


Oh, another thing I thought was odd. When I get mdadm --build /dev/md127 to work by using sdb1 alone, I get the following output from lsblk. md126 comes up below md127. I don't know why or where md126 came from.

Code:
sdb           8:16   0   1.8T  0 disk
└─sdb1        8:17   0   1.8T  0 part
  └─md127     9:127  0   1.8T  0 raid1
    └─md126   9:126  0     0B  0 md
 
I have a raid device on my Proxmox server which looks like this.
Code:
sda                              8:0    0   1.8T  0 disk
└─sda1                           8:1    0   1.8T  0 part 
  └─md0                          9:0    0   1.8T  0 raid1
    ├─vgvm-lvlan               253:0    0   1.5T  0 lvm   /data/store/vms/lan
    └─vgvm-lvlab               253:1    0   330G  0 lvm   /data/store/vms/lab
When I compare it to your output, it looks like your md126 device has become dependent on your md127 device, which is quite odd. I also notice that your md126 device has a size of 0 bytes. Which physical disk should md126 belong to? I would check that disk for I/O or other errors.
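
The 0-byte observation can also be checked mechanically. Below is a minimal sketch, assuming the default lsblk column order (NAME, MAJ:MIN, RM, SIZE, RO, TYPE), that lists md devices reported with a size of 0B. `zero_size_md` is a hypothetical helper name, and the sample input is the lsblk output posted earlier in this thread.

```shell
#!/bin/sh
# List md devices that lsblk reports with a size of 0B.
# Reads default-column lsblk output on stdin; field 4 is SIZE.
# zero_size_md is an illustrative helper name, not an existing tool.
zero_size_md() {
    awk '$4 == "0B" && $1 ~ /md[0-9]+$/ {
        name = $1
        sub(/^[^A-Za-z0-9]+/, "", name)   # drop tree-drawing characters
        print name
    }'
}

# Sample input copied from the lsblk output earlier in the thread.
zero_size_md <<'EOF'
sdb           8:16   0   1.8T  0 disk
└─sdb1        8:17   0   1.8T  0 part
  └─md127     9:127  0   1.8T  0 raid1
    └─md126   9:126  0     0B  0 md
EOF
```

On the posted output this prints md126. On a live system the same function could be fed from `lsblk` directly, and `cat /proc/mdstat` should show whether that device is an inactive array.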
 