Improving Reliability of Software RAID

Registered by Clint Byrum

mdadm currently has a gaggle of open bugs, and every cycle the RAID ISO tests produce new and interesting bugs. It seems like we're doing something a bit wrong with Software RAID. There are some proposed solutions at https://wiki.ubuntu.com/ReliableRaid, which should be discussed and either refuted or implemented.

Blueprint information

Status:
Started
Approver:
Steve Langasek
Priority:
High
Drafter:
Dimitri John Ledkov
Direction:
Approved
Assignee:
Dimitri John Ledkov
Definition:
Approved
Series goal:
Accepted for raring
Implementation:
Started
Milestone target:
ubuntu-13.04
Started by:
Steve Langasek

Related branches

Sprints

Whiteboard

Past Points:
[kees] Collect historical work done on improving raid: TODO
[kees] Write detailed specification of mdadm initramfs requirements: TODO
[kees] Write detailed specification of mdadm post-initramfs requirements: TODO
Test RAID over LVM and LVM over RAID: TODO
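As a rough sketch of the LVM-over-RAID layering that the test above needs to exercise, assuming an md array such as /dev/md0 already exists (the volume group name, LV name, and sizes below are placeholders):

  # Layer LVM on top of an existing md array (device and names are examples)
  sudo pvcreate /dev/md0
  sudo vgcreate vgtest /dev/md0
  sudo lvcreate -L 64M -n lvtest vgtest
  sudo mkfs.ext4 /dev/vgtest/lvtest

  # The interesting part of the test is then failing an md member (see the
  # failure-mode sketch further down) and confirming the LV stays usable
  # and comes back cleanly after a reboot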

Notes from etherpad:

there are a lot of bugs

- put together a tree of failure conditions
- map the intended handling for each failure condition
- check existing code against that intention, fix deltas
- suggest/recommend SMART monitoring in servers
- Look into automated testing of ALL supported RAID modes
- Test case for LVM over RAID
- Investigate and test booting without mdadm.conf
- Investigate not autostarting certain arrays
- Interface with upstream for feedback (invite to UDS-P)
- Fully document (maybe in conjunction with upstream?) and review existing documentation around software RAID debugging and general maintenance.
Multiple sources of disk/array information can be used to diagnose which device is which and what has happened, for example:
** ll /dev/disk/by-id/
** mdadm --detail /dev/md127
** cat /proc/mdstat
** messages from the kernel reference devices as "ata*.*", with no easy way to trace them back to a "real" /dev/sd* device (see the mapping sketch under "Debugging failure cases" below)
** lshw
** how to identify which physical drive is which? often "dd if=/dev/sd* of=/dev/null" (where * is the failed drive) and watching which drive stays solidly active (see the sketch after this list)
- The upstream documentation at https://raid.wiki.kernel.org/index.php/Linux_Raid is very basic when it comes to diagnostics or failure conditions, and is quite outdated in many areas.
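As a rough sketch of the identification steps above, assuming smartmontools is installed (device names here are examples only):

  # Note which member mdadm reports as faulty
  mdadm --detail /dev/md127

  # Map kernel names to stable ids; serial numbers are embedded in the symlink names
  ls -l /dev/disk/by-id/

  # Confirm the serial number and overall health of a suspect member
  sudo smartctl -i /dev/sdb
  sudo smartctl -H /dev/sdb

  # Physically locate it by generating steady read activity and watching its LED
  sudo dd if=/dev/sdb of=/dev/null bs=1M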

Intentions
- Preserve Data Integrity
- Detect Known Failure Modes as early as possible
- Allow the system to run and reboot cleanly even if partial hardware failures occur or have occurred.
- Provide Options for how to handle failure modes:
  - The BOOT_DEGRADED=false option lets admins safeguard against mdadm bugs and perform recovery manually, but it should default to true.
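A minimal sketch of how that option is wired up, assuming the Ubuntu initramfs-tools integration of this era (the exact path should be verified against the current mdadm packaging):

  # /etc/initramfs-tools/conf.d/mdadm -- persistent setting read at early boot
  BOOT_DEGRADED=true

  # Regenerate the initramfs after changing it
  sudo update-initramfs -u

  # A one-off override for a single boot can be passed on the kernel command line:
  #   bootdegraded=true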

Failure Modes
- Degraded array at boot
- Failed drive at runtime (see the simulation sketch after this list)
- Removed drive at runtime (where metadata is intact)
- Adding out-of-sync drive
- Adding failed drive
- Drives producing corrupt reads without failure
- Old RAID configuration resurrection
- LVM starting up on mirror halves
- Hardware is failing, but has not yet failed (SMART details)
  - can we link the SMART data to some kind of user reporting?
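A hedged sketch for exercising the "failed drive at runtime" mode above without real hardware, using a RAID1 array built on loop devices (file paths, sizes, and the md device name are arbitrary); the same scaffolding could feed the automated RAID testing item:

  # Two small backing files attached as loop devices
  dd if=/dev/zero of=/tmp/raid-a.img bs=1M count=128
  dd if=/dev/zero of=/tmp/raid-b.img bs=1M count=128
  LOOP_A=$(sudo losetup --show -f /tmp/raid-a.img)
  LOOP_B=$(sudo losetup --show -f /tmp/raid-b.img)

  # Create a RAID1 array over them
  sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 "$LOOP_A" "$LOOP_B"

  # Fail one member, watch the array go degraded, then remove it
  sudo mdadm /dev/md0 --fail "$LOOP_A"
  cat /proc/mdstat
  sudo mdadm /dev/md0 --remove "$LOOP_A"

  # Re-add the member and watch the resync
  sudo mdadm /dev/md0 --add "$LOOP_A"
  cat /proc/mdstat

  # Tear down
  sudo mdadm --stop /dev/md0
  sudo losetup -d "$LOOP_A" "$LOOP_B"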

Validate the behavior of drivers/arrays/drives
- does the driver notice a yanked drive? (see the hot-removal sketch after this list)
- does the driver notice a failed drive?
- how does the driver react to a new drive getting inserted?
** New drive being inserted with an alternate RAID config
** Some controllers are hot-swap capable and some are not; how do we identify which?
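A hedged sketch for the "yanked drive" question above, using the SCSI sysfs interfaces rather than physically pulling a disk (device and host numbers are examples):

  # Simulate removing /dev/sdb by deleting it from the SCSI layer
  echo 1 | sudo tee /sys/block/sdb/device/delete

  # The kernel log and mdadm should now show the member as missing/faulty
  dmesg | tail
  cat /proc/mdstat

  # Ask the controller to rescan and rediscover the drive
  echo "- - -" | sudo tee /sys/class/scsi_host/host0/scan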

Debugging failure cases (user side?)
- logic to align dm/ata information as expressed in dmesg etc. with the /dev/sdX devices that mdadm knows about
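A minimal sketch of that alignment, assuming libata-backed disks whose sysfs device paths contain an "ataN" component:

  # Print each sd device next to the ata port that backs it (if any)
  for dev in /sys/block/sd*; do
      ata=$(readlink -f "$dev/device" | grep -o 'ata[0-9]*' | head -n 1)
      printf '%s -> %s\n' "${dev##*/}" "${ata:-no ata link}"
  done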

Links:
https://wiki.ubuntu.com/HotplugRaid

drussell 2011-05-17: Added more content to the etherpad session post UDS...
slangasek 2011-10-31: this has been reproposed for a session at UDS-P, but I don't think any of the facts have changed. Why is another session needed for this?
cbyrum 2011-10-31: Agreed Steve, this just needs to get done, nothing has changed.
drussell - 2011-10-31: Absolutely agreed... so how do we focus on getting this done?
dmitrij.ledkov 2012-05-18: Adding foundations-q-degraded-hw-notification as a dependency for sending degraded raid notifications to the user & integrating SMART notifications.

The foundations-q-event-based-initramfs dependency is not fully determined yet; it is pending investigation of current RAID deficiencies. Only then might event-based-initramfs become a hard dependency.


Work Items

Work items:
create (Ubuntu|Upstream) RAID Architecture Specification: INPROGRESS
update existing (i.e. out-of-date) RAID documentation: TODO
identify and document all failure conditions: TODO
investigate if foundations-q-event-based-initramfs is required for completing this spec (UPDATE: it is, due to cryptsetup): DONE
test existing handling of failure modes for RAID: TODO
establish automated RAID testing and RAID failure condition testing: BLOCKED
review blueprint linked bugs for likelihood of fixing this cycle: DONE
get input from kees on the whole topic of Reliable Raid: DONE
backport critical reliability fixes to 12.04 LTS: DONE

Dependency tree
