Support GRUB's native root=ZFS=

Bug #1530953 reported by Richard Laager
46
This bug affects 5 people
Affects Status Importance Assigned to Milestone
zfs-initramfs (Ubuntu)
Fix Released
Medium
Colin Ian King

Bug Description

Upstream GRUB sets root=ZFS=rpool/path/to/rootfs by default. By supporting that syntax (while retaining the existing support for backwards compatibility), we can work out-of-the-box.

I've attached a patch for zfs-initramfs to implement this. Note that since this is creating a new file in the debian/directory, you have to apply this patch to the packaging, not put the file in debian/patches/. If you put the file in debian/patches, debuild will choke.

The patch has two parts. One, creating conf.d/zfs to set BOOT=zfs. This avoids the need to set boot=zfs on the Linux command-line in the GRUB configuration. The second part is a set of changes to scripts/zfs which grabs the ZFS= part from root= and stores that into ZFS_BOOTFS and ZFS_RPOOL.

Submitted upstream a year ago as:
https://github.com/zfsonlinux/pkg-zfs/pull/140

Tags: patch
Revision history for this message
Richard Laager (rlaager) wrote :
Richard Laager (rlaager)
description: updated
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "grub-native-root-syntax.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in zfs-initramfs (Ubuntu):
status: New → Confirmed
Changed in zfs-initramfs (Ubuntu):
importance: Undecided → Medium
no longer affects: grub
Revision history for this message
Colin Ian King (colin-king) wrote :

Which release are you filing the bug against?

Changed in zfs-initramfs (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Revision history for this message
Colin Ian King (colin-king) wrote :
Revision history for this message
Colin Ian King (colin-king) wrote :

Looking at the commit, is there any reason that this has not been accepted by upstream?

Noting comment by ryao on 2 Oct 2015:

"If I recall correctly, the root=ZFS=$DATASET syntax is myself and either @dajhorn or @rlaager agreed to use to adopt as something of a standard back in 2012. I would like to revisit it because the root=ZFS=$DATASET syntax does not properly handle pool namespace collisions. Ideally, I would like the boot loader to pass the information about the pool by passing the pool's GUID and vdev information (using by-path syntax) to the initramfs archive through the kernel commandline so that we can handle spa namespace collisions completely unambiguously. I have yet to make time to work out a syntax that makes sense and implement the hooks required for it. When this is done, I think the old syntax should remain around for backward compatibility.

Also, I believe that it is possible to put the required logic into the kernel by allowing us to detect the rootfs mount via current->fs == NULL, but I still need to verify this is NULL at the rootfs mount with kgdb. This would hypothetically allow us to do pool import inside the kernel using this information without an initramfs when copy-builtin is used to build a kernel with ZFS support. It would also simplify initramfs design in the normal case where one is loading modules by allowing us to omit ZFS' userland binaries from the initramfs. All we would need to do is use the standard mount command.

This is something of a tangent, but it is the direction that my thinking on the boot process has taken as it has evolved."

So from what I understand, this fix is a partial solution because it does not properly handle pool namespace collisions.

Changed in zfs-initramfs (Ubuntu):
status: Confirmed → In Progress
Revision history for this message
Richard Laager (rlaager) wrote :

If we want to try to handle pool name collisions (and frankly, I'm not convinced that we need to), that's a separate issue. That would involve changing GRUB as well. This patch is a good improvement on its own. If we want to enhance things more later, we can do that.

Revision history for this message
Richard Laager (rlaager) wrote :

(13:17:38) rlaager: ryao: Your comment on https://github.com/zfsonlinux/pkg-zfs/pull/140 is seeming to give the Ubuntu maintainer pause about accepting that patch. My position is that pool namespace collisions are a separate issue (which will require changes to GRUB as well), and that the patch as written is a useful improvement. Is that something you can agree with?
(13:41:32) ryao: rlaager: This is a good stopgap measure until someone takes time to implement a more complete solution. It isn't like merging this means GRUB2's format is going to change.
(13:42:31) ryao: To put it another way, merging this isn't going to making addressing the collision issue any harder and will help int he short term. It is fine to merge.
(13:48:49) ryao: rlaager: I just replied there.

Revision history for this message
Colin Ian King (colin-king) wrote :

Thanks for the update Richard.

Revision history for this message
Colin Ian King (colin-king) wrote :

Richard,

I've applied the change and uploaded it to my ppa:colin-king/zfs-xenial-experimental

I would appreciate it if you can test this for me, and for reference, explain the testing procedure you used so I can try this myself.
The build package should be ready to test by 17:00 UTC.

Colin

Revision history for this message
Richard Laager (rlaager) wrote :

In my original patch, I added BOOT=zfs code to conf.d/zfs. In your package, you have that code in conf-hooks.d/zfs instead.

I installed the package from your PPA. It does not work. If I move the code from conf-hooks.d/zfs to conf.d/zfs, then it works.

Here's my HOWTO for doing a ZFS root install: https://github.com/rlaager/zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem

For xenial, just substitute "xenial" for "wily" when doing the debootstrap and in /etc/apt/sources.list.

The goal of this change is to avoid what is currently step 4.7, adding "rpool=rpool bootfs=rpool/ROOT/ubuntu boot=zfs" to GRUB_CMDLINE_LINUX in /etc/default/grub.

Revision history for this message
Colin Ian King (colin-king) wrote :

Sorry about that, stupid mistake by me.

I've uploaded a new version into a different PPA, can you check this now works?

ppa:colin-king/zfs-xenial-lp1530953

Thanks

Changed in zfs-initramfs (Ubuntu):
status: In Progress → Incomplete
Revision history for this message
Colin Ian King (colin-king) wrote :

Followed the excellent instructions in https://github.com/rlaager/zfs/wiki/HOWTO-Install-Ubuntu-to-a-Native-ZFS-Root-Filesystem and using the updated zfs-initramfs package I can now boot just using root=ZFS=rpool/ROOT/ubuntu ro

Thanks for the patch, I'll get this uploaded into Xenial

Changed in zfs-initramfs (Ubuntu):
status: Incomplete → In Progress
Revision history for this message
Hajo Möller (dasjoe) wrote :

As a sidenote, this coupled with #1530457 should bring full support for a standard and portable rpool setup to Ubuntu, making it one of the first distributions that is able to boot natively off a pool with all currently implemented ZFS feature flags enabled/active.

Revision history for this message
Richard Laager (rlaager) wrote :

Can you also ship an /etc/udev/rules.d/90-zfs-vdev.rules with these contents?

KERNEL=="sd*[0-9]", IMPORT{parent}=="ID_*", ENV{ID_FS_TYPE}=="zfs_member", SYMLINK+="$env{ID_BUS}-$env{ID_SERIAL}-part%n"

This makes GRUB work if the user uses /dev/disk/by-id/x names, which upstream ZFS-on-Linux recommends. Without that, the user is required to use /dev/sdx names.

Note that this ONLY supports root-on-ZFS-on-partitions. Using partitions is what the upstream HOWTO recommends, what the current version of my HOWTO recommends (but not past versions), how other filesystems are configured on Linux, and what Solaris and its derivatives use for root-on-ZFS (which is also the default there).

I fought the good fight trying to make ZFS "wholedisk" root installs a thing on Linux, but that cannot work well without changes to GRUB. GRUB is simply not architected in a way to handle this well. (I wrote a patch to GRUB that makes it work, but it feels hacky and was rejected by upstream GRUB.) Fundamentally, the idea of giving ZFS the whole disk but then having ZFS partition the disk and use a partition is just messy. It's definitely a layering violation.

So for now, I'd like to see Ubuntu support root-on-ZFS-on-partitions. If circumstances change, it's easy to expand that later. But if you start supporting root-on-ZFS-on-wholedisk, then you're kinda stuck supporting that forever. So I'd rather see the conservative approach of only supporting partitions for now.

Revision history for this message
Richard Laager (rlaager) wrote :

Actually, I think the numbering for that udev file should NOT be in the 90s. I think the 90s are for the system administrator. So we probably want 60-zfs-vdev.rules if it's shipped by the package.

Revision history for this message
Hajo Möller (dasjoe) wrote :

I am not sure whether adding an udev rule that may (or may not, if not using wholedisk) work is a good idea.

Although, until grub's behavior is fixed shipping a sensible workaround seems useful, if the rule file can include a bit of documentation like a short explanation why it is necessary and commented examples for other setups.

Revision history for this message
Richard Laager (rlaager) wrote :

I'm attaching a debdiff which outlines the changes necessary to ship a rules.d file. It uses proper numbering in the name, is installed in the right place, and has the right rule. I've tested the package with this change applied. It also has a comment explaining what it does and why it's necessary.

If we feel it is necessary to provide an example for wholedisk setups, then I think we should ship an additional file in /etc (so it's modifiable by the administrator) with the commented out example. I can do that, if that's what is decided is desirable.

Also, I've removed the incorrect Vcs-Git and Vcs-Browser lines which say that the source of this package is still github.com/dajhorn/pkg-git.

Revision history for this message
Colin Ian King (colin-king) wrote :

I'll roll this into Xenial zfs-linux after we've got 0.6.5.4 uploaded (hopefully sometime this week).

Revision history for this message
Colin Ian King (colin-king) wrote :

Richard, I've put your changes into the 0.6.5.4 and uploaded it into the following test ppa: https://launchpad.net/~colin-king/+archive/ubuntu/zfs-0.6.5.4-alpha

Do you mind giving this a test and if it looks sane I'll upload it next week.

Revision history for this message
Richard Laager (rlaager) wrote :

This looks solid. One thing I wanted to note... putting the udev rules in zfs-initramfs means that someone installing a system needs to install zfs-initramfs outside of the chroot. If that file was in zfsutils-linux, they wouldn't have to install the extra package. For my HOWTO, it doesn't really matter. And at the moment, I still think it's the right call to ship the udev rules in zfs-initramfs (since they're only needed in that scenario). But I wanted to make sure you're aware of this detail.

Revision history for this message
Richard Laager (rlaager) wrote :

Actually, it looks like we need a `udevadm trigger` in zfs-initramfs's postinst script. Installing the package does not currently result in those devices appearing.

Revision history for this message
Richard Laager (rlaager) wrote :

Attached is a debdiff relative to the alpha PPA. This is the last of the changes I have already implemented for the boot process.

It reworks the zfs initramfs script to more closely follow the upstream Debian (non-zfs) initramfs scripts. Instead of using a ZFS-specific /etc/default parameter to add a minimum delay to the boot process, it uses the rootdelay= kernel command-line parameter used in other Debian initramfs scripts, and makes it a maximum delay. This means that administrators who need to add a delay (which is not the default) can set a larger delay for safety without forcing that extra delay on every boot.

It uses the standard Debian panic function, which then honors the panic= kernel command-line option, which can be set to disable the initramfs panic shell for security reasons. It also follows the upstream goal of clearing all the environment variables which are set.

This is a net code reduction.

Here's a slightly different version of that explanation:
https://github.com/zfsonlinux/pkg-zfs/commit/50f30514c35d2ba2d915e0b327932b260eecce09

This change is explicitly and intentionally drops support for the rpool= and bootfs= parameters. I would like to see them gone before ZFS support hits an Ubuntu LTS release, because it'll be harder to drop that support later. If you would like to maintain support for rpool= and bootfs= but like the rest of the changes, I can update the script to that effect.

Revision history for this message
Colin Ian King (colin-king) wrote :

Thanks for the patch Richard. I've uploaded ~2 to the PPA and it should be ready for testing in a short while. Let me know if this works OK and then I can re-test this on monday and sync it to Xenial.

Revision history for this message
Richard Laager (rlaager) wrote :

We still need the `udevadm trigger` so the udev rule applies without a reboot. Attached is a debdiff. This postinst follows the pattern that other packages use. I've verified that it has the desired effect.

Revision history for this message
Richard Laager (rlaager) wrote :

Also, the changelog entry for the udev rules file is wrong. The correct filename is: /lib/udev/rules.d/60-zpool.rules

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Richard, thanks for the patch, I've applied this and made the changelog entry change and uploaded a ~3 version in ppa:colin-king/zfs-0.6.5.4-alpha - if you don't mind a further test on this and if it looks good I'll get it pushed.

Revision history for this message
Richard Laager (rlaager) wrote :

I've attached a new 60-zpool.rules. Instead of matching on ENV{ID_FS_TYPE}=="zfs_member", it uses ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_TYPE}=="6a898cc3-1dd2-11b2-99a6-080020736631".

That is, it matches based on the partition type. This way, the rule matches when the partition is created, rather than when the pool is created. Otherwise, the rule doesn't trigger on the initial pool creation.

Revision history for this message
Colin Ian King (colin-king) wrote :

Thanks again Richard, I've uploaded ~4 to the PPA and it's ready for testing.

Revision history for this message
Colin Ian King (colin-king) wrote :

If you can test the ~4 and let us know if that's good, then please do, otherwise I can't upload it.

Revision history for this message
Richard Laager (rlaager) wrote :

Yes, that works fine.

I'm seeing deadlocks on 0.6.5.4, but that has nothing to do with this change.

Changed in zfs-initramfs (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Chad Miller (cmiller) wrote :

ZFS is so new to Linux that the paritioning tools we have don't have a ZFS in the Linux section of GPT partition types. We can find it in FreeBSD, Solaris, and Apple.

Someone is likely to pick something with Linux in the name like Linux Filesystem. That would be wrong for this patch.

The FreeBSD one would be wrong for this patch too.

udev can look at the filesystem information instead of the parition type id too. So, I suggest a new rule to catch all ZFS partitions by their contents.

KERNEL=="sd*[0-9]", IMPORT{parent}=="ID_*", ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_FS_TYPE}=="zfs_member", ENV{ID_FS_USAGE}=="raid", SYMLINK+="$env{ID_BUS}-$env{ID_SERIAL}-part%n"

Revision history for this message
Chad Miller (cmiller) wrote :
Revision history for this message
Colin Ian King (colin-king) wrote :

OK, if you attach a debdiff I'll apply this for the next upload

Revision history for this message
Richard Laager (rlaager) wrote :

Chad, I'm not opposed to Ubuntu including both, but if you look at the comment history, you'll see that I switched from zfs_member to the partition type for a reason. `zpool create` doesn't trigger udev, whereas partitioning tools do (by triggering the kernel to re-read the partition table). As a result, if we use a "zfs_member" rule, the udev rule doesn't fire until we either reboot or manually run `udevadm trigger`. That doesn't work well for an initial install (which is currently a manual process, but ideally some day we'll have installer support).

Revision history for this message
Colin Ian King (colin-king) wrote :

@Richard, do you have any views on this potential fix?

Revision history for this message
Richard Laager (rlaager) wrote :

@colin-king: It looks like we were commenting at the same time. Including *both* rules is not harmful in any *technical* way. The only concern I have is that then you're "supporting" other partition types, which may "obligate" you to do that "forever" (quotes because none of these things are literally true). That is, once you start supporting other partition types, people will expect that to not go away suddenly.

Revision history for this message
Chad Miller (cmiller) wrote :

Make this work, even for MBR disklabel, and GPT disklabel with wrong partition type id.

Neither the disklabel kind nor the partition type id are a necessary part of creating ZFS pools, We shouldn't require it here too.

Changed in zfs-initramfs (Ubuntu):
status: Fix Released → In Progress
Revision history for this message
Chad Miller (cmiller) wrote :

You're right, richard. The right thing to do is to fix grub. Having these rules at all is Wrong. Once grub is fixed, we will rip these out. We won't need to support them at all then.

Revision history for this message
Richard Laager (rlaager) wrote :

Why do you have ENV{ID_FS_USAGE}=="raid"?

Revision history for this message
Chad Miller (cmiller) wrote :

Richard, It was on all the zpool devices I checked with "udevadm info", and seemed to be related to the filesystem contents. That's all. Why?

Revision history for this message
Richard Laager (rlaager) wrote :

As far as I know, it's not a necessary condition. It's the first time I've seen it in a zfs_member udev rule.

Revision history for this message
Colin Ian King (colin-king) wrote :

Sorry, I've been out of the loop for a couple of weeks because of an operation. So just checking, is the debdiff in comment #38 ok by you Richard?

Revision history for this message
Hajo Möller (dasjoe) wrote :

I don't see why we would need have ENV{ID_FS_USAGE}=="raid", wouldn't ENV{ID_FS_TYPE}=="zfs_member" suffice?

Revision history for this message
Richard Laager (rlaager) wrote :

The right fix is here:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1527727

If you bring 0.6.5.6 and patch Ubuntu's GRUB to set that environment variable, then the udev rules can go away entirely.

Otherwise, as to the debdiff in #38, it is fine, but the "raid" condition is unnecessary as per my and dasjoe's previous comments.

Changed in zfs-initramfs (Ubuntu):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.