Hot-swapping SCSI devices in Linux

Lately I have had a lot of trouble with failing hard disks. Sadly, that isn’t even anything special. At first I didn’t mind much, but the older the drives got, the more frequently they died, and over the last few weeks they started dying at a daily rate. Hardware failures are so annoying: walk all the long way down to the darn cold server room, put a new drive in a frame, pull out the old drive, put in the new one. There isn’t much I can change about that part. And then… reboot the machine. On the internet, people are proud of machines that boot from BIOS to Windows in a few seconds, but the machines I have to use certainly offer no reason for pride. Booting takes ages, around 5 minutes I’d say. And yes, I have to stand right there and check that the new drive works. What a waste of time. Since the hardware theoretically supports hot-plugging, I thought I could at least skip that last step, and save myself roughly 5 minutes and certainly a terrible cold, if I managed to hot-plug the device not only in theory but in practice.

OK, this actually isn’t a big deal on Linux. Once you have done it, you know how it works. So for those who care, and for me in case I ever forget how it’s done, here’s a short description of how to get a new hard disk detected by a running Linux system:

  • Eagerly pull out the old, damaged drive while cursing badly. If you can, blame the manufacturer several times.
  • Take a brand new hard disk. Hold it under your nose and enjoy the smell of freshness for once. Put it in the slot from which you removed the old, crappy disk.
  • OK, now log in to the machine. Something with root privileges, please. I don’t want to put that stupid ‘sudo’ before every line…
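  • Tell the SCSI host to rescan by writing three dashes to its scan file, i.e. echo "- - -" > /sys/class/scsi_host/<host>/scan (replace <host> with the host you suspect the new device to be attached to; if in doubt, repeat it for all hosts you find).
  • You should now see the new device! Yay!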
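
The rescan step can be sketched as a small shell function. This is only a minimal sketch, assuming the usual sysfs layout under /sys/class/scsi_host; the function name rescan_scsi_hosts and its optional directory argument are made up for this example. It takes the "in doubt" route and repeats the scan for every host it finds:

```shell
#!/bin/sh
# rescan_scsi_hosts [DIR]   (hypothetical helper, not part of any package)
# Writes "- - -" (wildcard channel, target and LUN) to the scan file of
# every hostN entry below DIR. DIR defaults to the real sysfs location;
# the argument only exists so the function can be tried on a scratch directory.
rescan_scsi_hosts() {
    dir="${1:-/sys/class/scsi_host}"
    for host in "$dir"/host*; do
        [ -d "$host" ] || continue          # the glob matched nothing
        if echo "- - -" > "$host/scan"; then
            echo "rescanned ${host##*/}"
        else
            echo "could not rescan ${host##*/}" >&2
        fi
    done
}

# Usage, as root:
#   rescan_scsi_hosts
```

Afterwards the kernel log (dmesg) or lsblk should list the new disk.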

Ok, what did you do here? Well, you just sent a command to your scsi_host to rescan all channels, SCSI target IDs, and LUNs. Yes, you got it right, dashes are wildcards here and thus this basically means “rescan everything”.
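
If “rescan everything” is more than you want, each dash can be replaced by a number to narrow the scan down. Here is a small sketch that builds such a string; scan_triple is a hypothetical helper invented for illustration, and any field you leave out simply stays a wildcard:

```shell
#!/bin/sh
# scan_triple [CHANNEL] [TARGET] [LUN]   (hypothetical helper)
# Prints the string to write into a host's scan file. Every argument that
# is omitted, or given as "-", stays a wildcard.
scan_triple() {
    for field in "${1:--}" "${2:--}" "${3:--}"; do
        case "$field" in
            -) ;;                            # wildcard
            *[!0-9]*)                        # anything that is not a plain number
                echo "scan_triple: '$field' is neither a number nor '-'" >&2
                return 1 ;;
        esac
    done
    echo "${1:--} ${2:--} ${3:--}"
}

# Usage, as root (host0 is just an example host):
#   scan_triple 0 1 > /sys/class/scsi_host/host0/scan
```

Called as scan_triple 0 1 it prints "0 1 -", i.e. all LUNs of target 1 on channel 0.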

Now that the new device has been added, it might be a good idea to (re-)build the partition table and the RAID arrays, if that applies to your case. It does to me, so I’ll add it here even though it is a bit off-topic. Simple partitioning isn’t hard, just run your favorite tool. Try cfdisk, sfdisk or parted, whatever suits you best. Rebuilding the RAID array might be more interesting. This depends heavily on your setup; I usually just RAID1 (mirror) two drives. This is how I do it:

Mirror the partition table from one drive (sda) to the other (the new one, let’s call it sdb):
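
One way to do this, sketched under the assumption that sfdisk is the tool of choice: dump the table of the healthy disk and feed it to the new one. The wrapper mirror_partition_table and its DRY_RUN switch are made up for this example; sda and sdb are the names used above. Be careful, this overwrites the partition table of the target disk:

```shell
#!/bin/sh
# mirror_partition_table SRC DST   (hypothetical helper)
# Copies the partition table of SRC onto DST. DESTRUCTIVE for DST.
# With DRY_RUN=1 the pipeline is only printed, not executed.
mirror_partition_table() {
    src="$1"; dst="$2"
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: mirror_partition_table SRC DST (two different disks)" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sfdisk -d $src | sfdisk $dst"
        return 0
    fi
    # sfdisk -d dumps the table in a format that sfdisk reads back from stdin
    sfdisk -d "$src" | sfdisk "$dst"
}

# Usage, as root:
#   DRY_RUN=1 mirror_partition_table /dev/sda /dev/sdb   # look first
#   mirror_partition_table /dev/sda /dev/sdb             # then do it
```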

The next step is to check the RAID:
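
For Linux software RAID (md), the quickest look is cat /proc/mdstat. As a rough sketch, the following function picks out the arrays that are missing a member; degraded_arrays is a hypothetical name, and it assumes the usual mdstat layout in which a status map such as [U_] marks a missing member with an underscore:

```shell
#!/bin/sh
# degraded_arrays [FILE]   (hypothetical helper)
# Prints the name of every md array whose member map (e.g. [U_]) contains
# an underscore. FILE defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md/              { name = $1 }     # e.g. "md0 : active raid1 ..."
        /\[[U_]*_[U_]*\]/  { print name }    # member map with a hole in it
    ' "${1:-/proc/mdstat}"
}

# Usage:
#   cat /proc/mdstat      # the full picture
#   degraded_arrays       # only the arrays that need attention
```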

You should see your RAID is corrupted and degraded. To get it back up again, several ways might work depending on how damaged it is. I usually just remove any partitions that are lost due to one drive dying and add the corresponding partition on the new drive. This might work for you:
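
A sketch of that remove-and-add step with mdadm, under the assumption of a plain RAID1 on /dev/md0 with the new partition at /dev/sdb1 (both names are examples, as are the wrapper replace_member and its DRY_RUN switch). The keywords failed and detached tell mdadm to remove members that are marked failed or whose device has vanished:

```shell
#!/bin/sh
# replace_member ARRAY NEW_PARTITION   (hypothetical helper)
# Drops members that have failed or whose device has disappeared, then
# adds the new partition. With DRY_RUN=1 the mdadm calls are only printed.
replace_member() {
    array="$1"; part="$2"
    if [ -z "$array" ] || [ -z "$part" ]; then
        echo "usage: replace_member ARRAY NEW_PARTITION" >&2
        return 1
    fi
    run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"
    $run mdadm "$array" --remove failed      # members marked as faulty
    $run mdadm "$array" --remove detached    # members whose disk is gone
    $run mdadm "$array" --add "$part"        # starts the resync onto the new disk
}

# Usage, as root:
#   DRY_RUN=1 replace_member /dev/md0 /dev/sdb1
#   replace_member /dev/md0 /dev/sdb1
```

Repeat this for every array that lost a partition on the dead drive.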

Check the status with --detail and see if it worked. I don’t want to go too deep into how to use mdadm here, so this should be enough for now.
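
For an array called /dev/md0 (an example name) that would be mdadm --detail /dev/md0. If you would rather just watch the rebuild, here is a small sketch that pulls the progress out of /proc/mdstat; rebuild_progress is a hypothetical helper and assumes the usual "recovery = NN.N%" line that md prints while it resyncs:

```shell
#!/bin/sh
# rebuild_progress [FILE]   (hypothetical helper)
# Prints "<array> <percent>" for every md array that is currently
# recovering or resyncing. FILE defaults to /proc/mdstat.
rebuild_progress() {
    awk '
        /^md/ { name = $1 }
        /(recovery|resync) *=/ {
            # the percentage is the field right after the lone "="
            for (i = 1; i < NF; i++)
                if ($i == "=") { print name, $(i + 1); break }
        }
    ' "${1:-/proc/mdstat}"
}

# Usage:
#   mdadm --detail /dev/md0     # full status of one array
#   rebuild_progress            # just the rebuild percentage
```

Once the rebuild is done the function prints nothing and the member map in /proc/mdstat is back to [UU].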

So, I hope it helped someone. Braindump finished.
