multipath-tools-boot relies on scsi_wait_scan module, fails multipath setup

Bug #1538775 reported by Stuart Hopkins
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
multipath-tools (Ubuntu) | Fix Released | Undecided | Unassigned |
Trusty | Fix Released | Medium | Mathieu Trudel-Lapierre |

Bug Description

[Impact]
Users of multipath may see an error message on every boot (when in verbose mode) about the scsi_wait_scan module being unavailable.

[Test case]
Boot a 14.04 system with multipath-tools-boot (multipath devices present and the multipath-tools-boot package installed).

[Regression Potential]
None. This module was removed from the kernel a long time ago; as such, this change has no effect aside from removing an extra error message on boot.

------

Release: Ubuntu 14.04.3 LTS
Kernel: linux-image-3.16.0-59-generic
A clean system installed fresh today (2016-01-27)

In attempting to configure a system to boot-from-SAN and enable multipath support I ran into an issue whereby despite the multiple paths being detected (when running the multipath command from the CLI) the configuration wasn't being enabled at boot. After examining /usr/share/initramfs-tools/scripts/local-top/multipath I found the following:

verbose && log_begin_msg "Waiting for scsi storage"
{ rmmod scsi_wait_scan ; modprobe scsi_wait_scan ; rmmod scsi_wait_scan ; } >/dev/null 2>&1
verbose && log_end_msg

The problem appears to be that the scsi_wait_scan module doesn't exist and so there is no wait before the multipath scan is performed. I managed to observe this briefly during bootup (with the script edited) and could see it performed the scan before sda/sdb was discovered.

I also found a debian bug report indicating the module was removed a while back (https://lists.debian.org/debian-kernel/2012/05/msg00791.html).

After adding in an artificial delay for testing the multipath command does what is expected and configures the paths accordingly. I'm not sure what the correct approach is if the scsi_wait_scan module is removed.
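
For illustration, the fixed delay described above could be replaced by a bounded polling loop that returns as soon as the expected devices appear. This is only a sketch under stated assumptions, not the change that shipped in the package: the function name wait_for_paths, the timeout value and the device names in the usage comment are all hypothetical, and it assumes a POSIX shell such as the one available in the initramfs.

```shell
#!/bin/sh
# Hypothetical helper (not part of multipath-tools): wait until every
# listed path exists, or give up after a timeout given in seconds.
# Returns 0 if all paths appeared, 1 if the timeout expired first.
wait_for_paths() {
    timeout="$1"; shift
    elapsed=0
    while :; do
        missing=0
        for p in "$@"; do
            [ -e "$p" ] || missing=1
        done
        [ "$missing" -eq 0 ] && return 0
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Usage sketch (device names are assumptions for a two-path setup):
# wait_for_paths 30 /dev/sda /dev/sdb && multipath
```

Unlike a plain sleep, the loop costs nothing once the devices are present, but it still depends on knowing which device nodes to expect, so it is a local workaround rather than a general fix.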

I also found that the same local-top script doesn't load the dm-round-robin module (although the associated hook script does include it in the initrd); I'm not sure whether that is by design. I know I need it for my specific use case, but I don't know if it is deliberately left out to prevent breakage on SAN units that don't support native round-robin.
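
As a rough sketch of how one might check for the module before deciding to load it, the helper below scans a /proc/modules-style listing. The name module_loaded and its optional second argument are hypothetical. It assumes the kernel lists module names with underscores (dm_round_robin) and that the module is not built in, since built-in code does not appear in that listing.

```shell
#!/bin/sh
# Hypothetical helper: report whether a module appears in a
# /proc/modules-style listing. Hyphens in the requested name are
# mapped to underscores, which is how the kernel lists modules.
module_loaded() {
    mod=$(printf '%s' "$1" | tr '-' '_')
    modules_file="${2:-/proc/modules}"
    while read -r name _rest; do
        [ "$name" = "$mod" ] && return 0
    done < "$modules_file"
    return 1
}

# Usage sketch:
# module_loaded dm-round-robin || modprobe dm-round-robin
```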

Stuart Hopkins (stu-g)
affects: saucy-backports → multipath-tools (Ubuntu)
Changed in multipath-tools (Ubuntu):
status: New → Fix Released
Changed in multipath-tools (Ubuntu Trusty):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
description: updated
Stuart Hopkins (stu-g) wrote :

Just to be clear, it isn't the error message that is the problem when attempting to boot; it's that the system will not boot with multipath support because the multipath discovery in the initrd is performed before the SCSI devices are available. Removing the current code that refers to the removed module won't actually fix the issue, as the discovery will still take place before the qla2xxx driver has had a chance to discover the LUNs. The root fs will therefore be mounted on /dev/sda instead of /dev/mapper/mpath0, and this cannot be changed later (as the LUN is then in use).

This is why I had to add in a sleep statement so that the multipath scan takes place after the devices have been discovered. Not a clean solution at all, but I am unsure as to what replaced the scsi_wait_scan given the async changes.
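
To make the symptom above easier to check after boot, a small helper can classify the device backing the root filesystem. This is a sketch only: root_kind is a hypothetical name, and a /dev/mapper path indicates a device-mapper device in general (it could also be LVM), so it does not by itself prove that multipath is in use.

```shell
#!/bin/sh
# Hypothetical helper: classify a root device path by its prefix.
root_kind() {
    case "$1" in
        /dev/mapper/*) echo device-mapper ;;  # e.g. /dev/mapper/mpath0
        /dev/sd*)      echo single-path ;;    # e.g. /dev/sda1
        *)             echo unknown ;;
    esac
}

# Usage sketch, assuming util-linux findmnt is available:
# root_kind "$(findmnt -n -o SOURCE /)"
```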

Mathieu Trudel-Lapierre (cyphermox) wrote :

This is already fixed in 0.5.0 so Fix Released for xenial; not applicable to wily (which also has the fix already).

Work to land this fix in 14.04 in progress.

Brian Murray (brian-murray) wrote : Please test proposed package

Hello Stuart, or anyone else affected,

Accepted multipath-tools into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/multipath-tools/0.4.9-3ubuntu7.8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in multipath-tools (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Mathieu Trudel-Lapierre (cyphermox) wrote :

Stuart,

I'm aware of that, but there are two different issues at play here: the detection and scanning of drives, and the fact that scsi_wait_scan is still in the script. The upload should likely fix both; but I'm closing this bug here because it's about scsi_wait_scan. The other bugs are referenced in the uploaded package, linked in the previous comment.

Stuart Hopkins (stu-g) wrote :

Brian,

Package tested (0.4.9-3ubuntu7.8) and I can confirm that it does resolve the problem around loading the missing module. It doesn't, however, fix the issue of the multipath root device working (my system is back to sda rather than mpath). As per Mathieu's comment (4) I will update LP:1526984, as that is a closer match to the current symptoms (though it's not a complete match).

tags: added: verification-done
removed: verification-needed
Mauricio Faria de Oliveira (mfo) wrote :

Hi,

Given the conversation/explanation on this bug (1: the problem is the lack of a SCSI wait scan, and 2: this bug aims only to remove the error message), and my additional comments on bug 1526984 comment #18 (summary: the SCSI wait scan problem is not supposed to be fixed by either udevadm settle or multipathd), I think we can mark this as verification-done (the message is removed), and move on to how to obtain effects similar to the SCSI wait scan, or an effective fix for that problem.

Mathieu Trudel-Lapierre (mathieu-tl) #4
> [snip] I'm closing this bug here because it's about scsi_wait_scan.

Stuart Hopkins (stu-g) #5
> [snip] confirm that it does resolve the problem around loading the missing module.

@mathieu-tl, I'm re-marking this as verification-done, given you specified the scope in the bug description.
and @stu-g, I'd really appreciate seeing some form of fix to this kind of issue. I realize the sleep workaround works (even suggested that myself.. heh. got a flat no). Ideas are welcome. Maybe that behavior should be reported on another bug.

Mathieu Trudel-Lapierre (cyphermox) wrote :

FWIW, it's not so much a flat no to fixing this with a sleep; it's that any possible sleep value (unless it's unreasonably high) will still fail for some people... and high values are just inconvenient since they slow the boot down. It's just not a very useful general fix, even if it works locally as a workaround.

Stuart, did you file a new bug for the actual scan wait process for devices?

Stuart Hopkins (stu-g) wrote :

Haven't had a chance to raise it yet (having to use the environment for something else atm so I wouldn't be able to capture logs), though plan to do it next week once the environment is free.

Some thoughts around potential workarounds/fixes:
- Boot option (similar to rootwait/rootdelay) to impose a sleep period (this way it's optional and flexible)
- Package to go alongside multipath-boot that adds a new script (which in turn waits), and executes before multipathd
- Getting udev to force a scan of the SCSI bus and wait for it to finish (i.e. indicate there is nothing new)
- Figuring out why the qla2xxx driver doesn't seem to like me...
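
The first idea in the list above could be sketched as a small parser for a kernel command line string. Everything here is hypothetical: no mpath_delay boot parameter exists in multipath-tools, and get_boot_param is an illustrative name. The sketch assumes simple name=value words separated by whitespace, as initramfs scripts commonly do when reading /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: print the value of name=value from a kernel
# command line string. Returns 1 if the parameter is absent.
get_boot_param() {
    name="$1"
    cmdline="$2"
    # Intentionally unquoted so the command line splits into words.
    for word in $cmdline; do
        case "$word" in
            "$name"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Usage sketch (mpath_delay is an invented parameter name):
# delay=$(get_boot_param mpath_delay "$(cat /proc/cmdline)") && sleep "$delay"
```

Because the delay is only applied when the parameter is present, systems that do not need it would boot at normal speed.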

Brian Murray (brian-murray) wrote :

Hello Stuart, or anyone else affected,

Accepted multipath-tools into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/multipath-tools/0.4.9-3ubuntu7.9 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: removed: verification-done
tags: added: verification-needed
Mathieu Trudel-Lapierre (cyphermox) wrote :

This was already verified successfully before; the additional update does not need reverification, only a check for the regression in bug 1543430.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package multipath-tools - 0.4.9-3ubuntu7.9

---------------
multipath-tools (0.4.9-3ubuntu7.9) trusty; urgency=medium

  * debian/patches/kpartx-support-device-names-with-spaces.patch: fix loopback
    files unmapping. (LP: #1543430)

multipath-tools (0.4.9-3ubuntu7.8) trusty; urgency=medium

  * debian/patches/kpartx-support-device-names-with-spaces.patch: deal with
    spaces in device names in kpartx too (LP: #1432062)
  * debian/initramfs/local-premount: wait for udev to settle before mounting
    so the by-uuid/ symlinks have a chance to be updated by udev rules.
    (LP: #1503286)
  * Allow device detection all through the initramfs: run multipathd instead
    of only scanning once for devices, so those that come up slower can still
    be used as a root device (LP: #1526984):
    - debian/patches/0050-readonly-bindings_prefix.patch,
      debian/patches/0051-readonly-bindings_multipath.patch,
      debian/patches/0052-readonly-bindings_multipathd.patch,
      debian/patches/0053-readonly-bindings_multipathd_prod.patch: support -B
      to allow multipathd to handle cases where the bindings file is read-only.
    - debian/initramfs/hooks: install multipathd and required directories.
    - debian/initramfs/local-premount: also reload all maps to make sure
      they're ready before we mount.
    - debian/initramfs/local-top: run multipathd rather than a one-off call to
      multipath so that new paths can be correctly added as detected while
      we're still in the initramfs.
    - debian/initramfs/local-bottom: remember to stop multipathd.
    - debian/initramfs/local-bottom, debian/rules: install local-bottom for
      initramfs.
  * debian/patches/lp1496210_add_IBM_XIV_defaults.patch: add support (default
    config values) for the IBM 2810XIV storage system. (LP: #1496210)
  * debian/patches/0054-kpartx-update-option.patch: run kpartx -u rather than
    kpartx -a, so as to remove old partition entries if the partition table
    has changed. (LP: #1473903)
  * debian/patches/multipath_enable_sync_support_1b8082c8.patch,
    debian/patches/kpartx_rely_on_udev_dev_creation_9a632fff.patch: synchronize
    udev, device-mapper and multipath, and let udev deal with creating device
    nodes and symlinks. (LP: #1486370)
  * debian/initramfs/local-top: drop scsi_wait_scan stanza, that module is no
    longer available. (LP: #1538775)

 -- Mathieu Trudel-Lapierre <email address hidden> Tue, 09 Feb 2016 16:03:10 -0500

Changed in multipath-tools (Ubuntu Trusty):
status: Fix Committed → Fix Released
Adam Conrad (adconrad) wrote : Update Released

The verification of the Stable Release Update for multipath-tools has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
