Software RAID: Setup, Management, and Administration

Below is a detailed, production-ready guide to software RAID setup, management, monitoring, and drive-failure recovery.

It applies to most Linux server environments, including cPanel/WHM, AlmaLinux/Rocky, Ubuntu/Debian, and generic VPS or dedicated servers, and uses mdadm, the standard Linux software RAID manager.

⭐ Software RAID Setup, Management & Administration

(with Full Failure-Recovery Procedures)
  1. 🔍 Introduction to Software RAID

    Software RAID uses the OS kernel (via mdadm) to create and manage arrays of multiple disks for redundancy, performance, or both.

    Common RAID Levels
    Level   | Min Disks | Purpose                | Fault Tolerance          | Notes
    RAID 0  | 2         | Striping               | None                     | Performance only; not recommended for production
    RAID 1  | 2         | Mirroring              | 1 disk                   | Most common for servers (OS partitions)
    RAID 5  | 3         | Striping + parity      | 1 disk                   | Good compromise; slow rebuilds; not recommended on very large disks
    RAID 6  | 4         | Striping + dual parity | 2 disks                  | Better suited to large disks
    RAID 10 | 4         | Striped mirrors        | 1 disk per mirrored pair | Best performance and redundancy

    mdadm software RAID is extremely common on WHM/cPanel dedicated servers and Linux VPS.

  2. 🧰 Install Required Tools

    • RHEL / AlmaLinux / Rocky
      sudo dnf install mdadm smartmontools -y
      
    • Debian / Ubuntu
      sudo apt install mdadm smartmontools -y
      
  3. 🛠️ Create a New RAID Array

    1. Identify the Disks

      lsblk
      fdisk -l
      

      Assume disks /dev/sdb and /dev/sdc.

    2. Prepare Disks (create partitions)

      Use GPT for modern layouts:

      parted /dev/sdb mklabel gpt
      parted /dev/sdb mkpart primary 0% 100%
      parted /dev/sdc mklabel gpt
      parted /dev/sdc mkpart primary 0% 100%
      

      Mark partitions as RAID:

      sudo parted /dev/sdb set 1 raid on
      sudo parted /dev/sdc set 1 raid on
      

      Partitions become:

      /dev/sdb1
      /dev/sdc1
      
  4. 🧱 Create RAID Arrays

    1. RAID 1 (Mirror)

      mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
      
    2. RAID 5 Example

      mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sd[bcd]1
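
    3. RAID 10 Example

      RAID 10, which the level table above recommends for combined performance and redundancy, follows the same pattern. A minimal sketch, assuming four prepared partitions /dev/sdb1 through /dev/sde1:

      mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[bcde]1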
      
  5. 📦 Add Filesystem & Mount

    mkfs.ext4 /dev/md0
    mkdir /mnt/raid
    mount /dev/md0 /mnt/raid
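
    To keep the mount across reboots, you can add an /etc/fstab entry keyed by filesystem UUID. A minimal sketch (the UUID is a placeholder; substitute the value blkid reports for /dev/md0):

    blkid /dev/md0
    echo 'UUID=<uuid-from-blkid>  /mnt/raid  ext4  defaults,nofail  0 2' >> /etc/fstab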
    
  6. 🔖 Persist RAID Assembly

    Write array definition:

    mdadm --detail --scan >> /etc/mdadm.conf
    

    Or on Debian-based:

    mdadm --detail --scan >> /etc/mdadm/mdadm.conf
    update-initramfs -u
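
    On RHEL-family systems the corresponding step is to regenerate the initramfs with dracut, which matters mainly when the array holds the root or /boot filesystem (an optional but recommended step):

    dracut -f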
    
  7. 📊 Monitoring RAID Health

    1. Check RAID Status

      cat /proc/mdstat
      

      Sample output:

      md0 : active raid1 sdb1[0] sdc1[1]
            976630336 blocks [2/2] [UU]
      
      • UU = both devices healthy
      • _U = first device failed or missing
      • U_ = second device failed or missing
    2. Detailed View

      mdadm --detail /dev/md0
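
    For unattended monitoring, a minimal watchdog sketch (a hypothetical script, suitable for a cron job) that exits non-zero whenever /proc/mdstat reports a missing member:

      #!/bin/bash
      # Exit 1 if any md array shows a failed/missing member, e.g. [U_] or [_U].
      if grep -q '\[.*_.*\]' /proc/mdstat; then
          echo "WARNING: degraded RAID array detected" >&2
          exit 1
      fi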
      
  8. 🚨 Replacing a Failed Drive (RAID1/5/6/10)

    This is the most important part for production systems.

    Symptoms of Failed Disk
    • cat /proc/mdstat shows _U or U_
    • Server logs show I/O errors (see the quick check after this list)
    • SMART failures: smartctl -a /dev/sdX
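
    A quick way to spot those I/O errors in the kernel log (journalctl assumes a systemd-based distro):

      dmesg | grep -i 'i/o error'
      journalctl -k | grep -i 'i/o error'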
  9. 🧹 Step-by-Step Drive Failure Recovery

    Assume:

    • Array: /dev/md0
    • Bad disk: /dev/sdb1
    • Replacement disk: /dev/sdd
    1. Identify the Faulty Drive

      mdadm --detail /dev/md0
      

      You’ll see something like:

      Number  Major  Minor  RaidDevice State
         0     8       17        0      faulty   /dev/sdb1
         1     8       33        1      active   /dev/sdc1
      
    2. Mark Drive as Failed

      mdadm --fail /dev/md0 /dev/sdb1
      
    3. Remove the Failed Drive

      mdadm --remove /dev/md0 /dev/sdb1
      
    4. Prepare the New Drive

      If whole disk:

      parted /dev/sdd mklabel gpt
      parted /dev/sdd mkpart primary 0% 100%
      parted /dev/sdd set 1 raid on
      
    5. Add New Drive to the Array

      mdadm --add /dev/md0 /dev/sdd1
      

      Rebuild begins automatically.

      Monitor progress:

      watch cat /proc/mdstat
      
  10. 🔄 Rebuilding the Array

    Expected output during rebuild:

    [>....................]  recovery = 5.3% (103424/1953512448) finish=120.5min speed=26000K/sec
    
  11. 🧾 Clone Partition Table Automatically (Optional Best Practice)

    If your drives must match exactly:

    sfdisk -d /dev/sdc | sfdisk /dev/sdd
    

    Then add the partition:

    mdadm --add /dev/md0 /dev/sdd1
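
    On GPT disks, sgdisk (from the gdisk package) can perform the same clone; a sketch, assuming it is installed:

    sgdisk -R /dev/sdd /dev/sdc   # replicate /dev/sdc's partition table onto /dev/sdd
    sgdisk -G /dev/sdd            # give the cloned disk unique GUIDs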
    
  12. ⚡ Hot Spare Setup (Automatic Recovery)

    Add a spare disk:

    mdadm --add /dev/md0 /dev/sde1
    

    Verify with mdadm --detail /dev/md0; the output should include:

    Spare Devices : 1
    

    If a disk fails, mdadm automatically pulls in the spare.

  13. 🛡️ SMART Monitoring

    Schedule SMART tests:

    Create /etc/cron.weekly/smartcheck:

    #!/bin/bash
    smartctl -t short /dev/sda
    smartctl -t short /dev/sdb
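
    Make the script executable so cron will run it, and review results afterwards from the drive's self-test log:

    chmod +x /etc/cron.weekly/smartcheck
    smartctl -l selftest /dev/sda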
    
  14. 🔐 Email Alerts for RAID Failure

    Install mdadm mail alerts:

    Edit /etc/mdadm.conf (or /etc/mdadm/mdadm.conf on Debian/Ubuntu):

    MAILADDR admin@yourdomain.com
    

    Restart:

    systemctl restart mdmonitor
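
    To confirm alerting works end to end, mdadm can send a one-off test message for each array it finds:

    mdadm --monitor --scan --test --oneshot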
    
  15. 🩺 Advanced Diagnostics

    • Check current RAID bitmap (helps fast rebuild)
      mdadm --detail /dev/md0 | grep -i bitmap
      
    • Verify stripes (RAID5/6)
      echo check > /sys/block/md0/md/sync_action
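
    After a check finishes, the mismatch counter reports how many inconsistent blocks were found (0 is the healthy result; small nonzero values on RAID1 arrays holding swap can be benign):

      cat /sys/block/md0/md/mismatch_cnt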
      
  16. 💣 Troubleshooting Scenarios

    1. Scenario A: RAID shows “degraded” even after rebuild

      Re-add the disk:

      mdadm --re-add /dev/md0 /dev/sdd1

      If mdadm refuses the re-add, remove the device, zero its superblock (see Scenario D), and add it back with --add so it rebuilds from scratch.
      
    2. Scenario B: md0 will not assemble on boot

      mdadm --assemble --scan
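
      If assembly still fails, inspect each member's RAID superblock to see what mdadm actually recognizes:

      mdadm --examine /dev/sdb1 /dev/sdc1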
      
    3. Scenario C: Accidentally removed the wrong disk

      Re-add it:

      mdadm --add /dev/md0 /dev/sdb1
      
    4. Scenario D: Superblock errors

      Zero superblock before reuse:

      mdadm --zero-superblock /dev/sdd1
      
  17. 📦 Backup mdadm metadata (critical!)

    Save RAID definition:

    mdadm --detail --scan > /root/mdadm.backup
    

    Save disk partition tables:

    sfdisk -d /dev/sda > /root/sda.part
    sfdisk -d /dev/sdb > /root/sdb.part
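
    To restore later, feed the saved partition table back to the replacement disk and reinstate the array definition:

    sfdisk /dev/sdb < /root/sdb.part
    cat /root/mdadm.backup >> /etc/mdadm.conf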
    
  18. 🧹 Full Cleanup Commands (Destroy RAID)

    umount /mnt/raid
    mdadm --stop /dev/md0
    mdadm --remove /dev/md0
    mdadm --zero-superblock /dev/sd[bcd]1
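
    If the disks will be reused outside of RAID, you can also clear any leftover filesystem signatures (optional; wipefs is part of util-linux):

    wipefs -a /dev/sd[bcd]1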
    

📘 Summary: Best Practices for Software RAID Administration

✔ Always use RAID 1 or RAID 10 for critical servers
✔ Keep at least one hot spare on RAID 5/6/10
✔ Enable email alerts
✔ Monitor smartctl logs weekly
✔ Run periodic RAID checks (see the example after this checklist)
✔ Save /etc/mdadm.conf after any modification
✔ Use identical disks whenever possible
✔ Keep a replacement drive on hand
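
A periodic consistency check can be scheduled with a simple cron entry; a sketch assuming a single array at /dev/md0 (many distros already ship an equivalent job, such as raid-check on RHEL-family or checkarray on Debian):

# /etc/cron.d/raid-check (example): verify /dev/md0 on the 1st of each month at 03:00
0 3 1 * * root /bin/sh -c 'echo check > /sys/block/md0/md/sync_action'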

Conclusion

With mdadm you can build, monitor, and repair software RAID arrays on almost any Linux server. Pair proactive monitoring (mdstat, SMART tests, email alerts) with the recovery procedures above, and a failed drive becomes a routine replacement rather than an outage.


Editorial Staff

Rad Web Hosting is a leading provider of web hosting, Cloud VPS, and Dedicated Servers in Dallas, TX.