Gentoo installation with raid, lvm, luks and systemd

It's that time again. I have once again been contemplating switching away from Gentoo to Debian or Arch. But I just can't do it. The reason this issue came up again is that I once again tried to do a quick update, this time moving to systemd. But I put too little thought into it and messed my system up quite a bit. After some consideration I decided however that I won't move away from Gentoo. I have been using it for years, got so used to it and still have a lot to learn, so I'll stick with it. Another, even better reason became crystal clear to me after watching the video "Gentoo Linux, or Why in the World You Should Compile Everything". A source based distribution is not about speed but all about control over what gets onto and happens on your system. The idea of having all the parts out there and having them magically combine into a complete system just appeals to me. It is very much how I think of my own programming projects. You lay out all the pieces and then see them sort of self assemble anytime you want. This blog post will document my fresh Gentoo install from within my currently running system.

Planned Configuration

  • GPT partition layout for UEFI boot
  • RAID1 mirror with 2 identical disks. I will start with just one disk and add the other one later.
  • LVM on top to easily manage how the space is used
  • LUKS encrypted LVM volumes for user home directories and swap
  • bare system with custom kernel and systemd working

After all that is working I will probably go with my trusty XFCE desktop again. So I will only install a minimal system to start with.

Preparation

I start this installation from within my current Gentoo system. But you can of course use pretty much any Linux LiveCD or USB stick. If you do so I suggest booting the CD or USB stick in UEFI mode; then you won't need the awkward reboot in the middle. I recommend SystemRescueCD.

Since I am going to have encrypted volumes it is a good idea to initialize the drive with random data. Like this:

dd if=/dev/urandom bs=4096 | pv > /dev/sdb
65.6MiB 0:00:04 [16.3MiB/s] [>                                ]  0% ETA 32:19:40

This will take some time, as the Pipe Viewer pv shows us. I will just fill the first 200 GB or so now and the rest later. Why random data? First of all the drive I am using has been used before, so I do not want any old stuff from whomever lurking around there. Secondly, if we were to just zero the drive it would later be much easier to see how much data is on the encrypted home volumes. Since we are using encryption we might as well do it right. Filled with random data first, the volumes will just look like big binary blobs if you do not have the key, regardless of whether just one byte is used or the whole volume.
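Filling only part of the disk can be done by giving dd an explicit block count. The following is a minimal sketch, assuming GNU dd; the helper name fill_cmd is made up for this post, and it only prints the command so that nothing gets overwritten by accident. Both ends of the range are rounded up to whole 1 MiB blocks.

```shell
# fill_cmd (hypothetical helper): print a dd command that writes random data to
# the range from start_gb to end_gb (decimal GB) of a device.
# It only prints the command; review it and run it by hand.
fill_cmd() {
    dev=$1 start_gb=$2 end_gb=$3
    bs=1048576                                             # 1 MiB blocks
    seek=$(( (start_gb * 1000000000 + bs - 1) / bs ))      # first block to write
    last=$(( (end_gb * 1000000000 + bs - 1) / bs ))        # first block NOT to write
    echo "dd if=/dev/urandom of=$dev bs=$bs seek=$seek count=$(( last - seek )) iflag=fullblock"
}

fill_cmd /dev/sdb 0 200      # first pass: roughly the first 200 GB
```

If you continue with a non-zero start later on, check the exact end of the last partition first (for example with parted's "unit s print") so that the start lies safely behind it.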

Partitioning

UEFI

We will use UEFI with a GPT partition table. To create it we use parted.

parted /dev/sdb
(parted) mklabel GPT  
(parted) mkpart ESI fat32 0% 500m
(parted) set 1 boot on   
(parted) print                                                            
Model: ATA WDC WD2002FAEX-0 (scsi)
Disk /dev/sdb: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End    Size   File system  Name  Flags
 1      1049kB  500MB  499MB               ESI   boot

This is the UEFI boot partition, where the UEFI capable BIOS will look for bootable files (in our case GRUB2). The "1049kB" confused me for a short while but is effectively sector 2048. So there is some free space in front of it should you need to squeeze GRUB in there for non-UEFI booting. We continue with the swap partition.

Swap

(parted) mkpart raid1-swap 500m 35g
(parted) set 2 raid on

I used 35 GB as an endpoint because this way I have a swap size of 32 GiB and a tiny bit extra. This is the maximum RAM size I expect this system to ever have (8 GiB at the moment). Why swap at all? Strictly speaking your system would most likely work fine without it since the 8 GiB in my case are plenty for most things. But there is no harm in having swap and I plan to use hibernation as well. You can choose the partition name "raid1-swap" freely. I toggled the "raid" flag as well to indicate to the kernel automounter that this will be a RAID partition. Now comes the root partition.
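The swap size arithmetic is easy to verify. A small sketch, assuming that parted's "m" and "g" are decimal units; the function name mb_to_gib is only for illustration.

```shell
# mb_to_gib: convert a size in decimal MB to binary GiB (two decimals).
mb_to_gib() {
    awk -v mb="$1" 'BEGIN { printf "%.2f\n", mb * 1000000 / (1024 * 1024 * 1024) }'
}

mb_to_gib $(( 35000 - 500 ))   # swap partition: from 500 MB to 35 GB
```

The RAID superblock takes a little of that space later on, so the usable swap ends up slightly smaller.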

Root

(parted) mkpart raid1-root 35g 36g
(parted) set 3 raid on

1 GB for root should be plenty since I will have /usr, /home, /var and if necessary others on LVM. Now the partition for the LVM.

LVM

(parted) mkpart raid1-lvm 36g 200g
(parted) set 4 raid on

Simple enough. I used just 200 GB of this 2 TB disk because the rest of it has not yet been overwritten with random data.

A final look:

(parted) p                                                                
Model: ATA WDC WD2002FAEX-0 (scsi)
Disk /dev/sdb: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name        Flags
 1      1049kB  500MB   499MB                ESI         boot
 2      500MB   35.0GB  34.5GB               raid1-swap  raid
 3      35.0GB  36.0GB  999MB                raid1-root  raid
 4      36.0GB  200GB   164GB                raid1-lvm   raid

quit

File System, RAID and LVM preparation

UEFI

The UEFI system partition needs FAT32.

mkfs.vfat -F 32 /dev/sdb1

RAID

Now we set up the RAID superblocks for our swap, root and LVM partitions. You may note that I do not start with md0. I am configuring these from within my current Gentoo system which also happens to have a RAID. So I cannot start with md0. But that does not matter, they are not permanent names. If you run through this guide a second time or have set up the RAID partitions before for another reason you may get a note like "mdadm: Note: this array has metadata at the start". Make sure everything is ok and then you can overwrite the old partitions. Also note that I start with just one drive and specify the second as "missing". I will add the other one when my installation is complete and sync them.

mdadm --create /dev/md1 --level=mirror --raid-devices=2 /dev/sdb2 missing
mdadm --create /dev/md2 --level=mirror --raid-devices=2 /dev/sdb3 missing
mdadm --create /dev/md3 --level=mirror --raid-devices=2 /dev/sdb4 missing

Note: The arrays will be assembled in the initramfs, so you can safely use the default metadata version 1.2 even though the kernel's own RAID autodetection only understands the old 0.90 format.

Check it:

cat /proc/mdstat
Personalities : [raid1] 
md3 : active raid1 sdb4[0]
      160025472 blocks super 1.2 [2/1] [U_]

md2 : active raid1 sdb3[0]
      975296 blocks super 1.2 [2/1] [U_]

md1 : active raid1 sdb2[0]
      33658752 blocks super 1.2 [2/1] [U_]
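
As long as the second disk is missing, every array is expected to show [2/1] [U_]. The following sketch lists degraded arrays from text in the mdstat format. The helper name degraded_arrays and the md9 sample array are made up for illustration.

```shell
# degraded_arrays (hypothetical helper): read mdstat-formatted text on stdin and
# print every array whose member map, e.g. [U_], contains a missing disk.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }     # array header line
        /\[[U_]*_[U_]*\]/  { print name }    # status line with at least one "_"
    '
}

# Sample input; on a live system use: degraded_arrays < /proc/mdstat
printf '%s\n' \
    'md3 : active raid1 sdb4[0]' \
    '      160025472 blocks super 1.2 [2/1] [U_]' \
    'md9 : active raid1 sda1[0] sdc1[1]' \
    '      975296 blocks super 1.2 [2/2] [UU]' | degraded_arrays
```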

Our root file system is simple ext4.

mkfs.ext4 /dev/md2

LVM

First select the physical volumes for the volume groups. This is the complete md3.

pvcreate /dev/md3

Add it to a new volume group.

vgcreate vg /dev/md3

"vg" is just the name for the volume group. You can of course pick your own.

Now we have a place to set up some logical volumes. We will make some for /usr, /var and /home/user1.

lvcreate -L 20G -n usr vg
mkfs.ext4 /dev/vg/usr

lvcreate -L 20G -n var vg
lvcreate -L 40G -n home-user1 vg

/usr does not need to be encrypted because there is no actual user data in it. Just the applications which are open source anyway. I am not paranoid enough to encrypt which applications I installed. /var however usually contains mail and printer spool as well as database stuff.

Crypto Preparation

At this point we have three partitions that have to be encrypted:

1. swap (/dev/md1)
2. /var (/dev/vg/var)
3. /home/user1 (/dev/vg/home-user1)

Swap needs to be encrypted so that unencrypted user data does not get stored on the disk in case of swapping or hibernation. Pretty much the same goes for /var which holds database data as well as mail and printer spools. However if your system wakes up from hibernation you will not want an extra password but instead the password from a user such as user1 should work. The same goes for unlocking /var.

Achieving this is not that difficult because you can have more than one decryption key (8 to be exact) per LUKS encrypted volume. So swap and /var get the root key (root password) as well as the regular users' passwords (user1 and possibly others). The home partitions should of course only have one password.

This setup has however the one flaw that you will have to enter the password two or three times. Once or twice during boot for unlocking /var and swap. And once for logging in with your user. This can maybe be mitigated by using graphical overlays during the boot process such as plymouth which can cache the password temporarily.

Encrypting the Volumes

Now we initialize the partitions. I use the default key parameters. You will be asked for a password. For now swap and var will get my root password. The user user1 gets his respective password.

cryptsetup luksFormat /dev/md1
cryptsetup luksFormat /dev/vg/var
cryptsetup luksFormat /dev/vg/home-user1

The users should however also be able to open swap and /var with their user password. This way they can reboot or wake up from hibernation without root access. So we add user1's password to swap and /var.

cryptsetup luksAddKey /dev/md1
[enter root password and then enter user1 password]
cryptsetup luksAddKey /dev/vg/var
[enter root password and then enter user1 password]

Initializing File Systems

At this step we will open the encrypted volumes and initialize them with the respective file systems.

cryptsetup open /dev/md1 swap
mkswap /dev/mapper/swap

cryptsetup open /dev/vg/var var
mkfs.ext4 /dev/mapper/var

cryptsetup open /dev/vg/home-user1 home-user1
mkfs.ext4 /dev/mapper/home-user1

Of course you will need to enter the respective passwords when opening the partitions.

Gentoo Installation

We can now switch over to the regular Gentoo installation and start like this:

# activate swap
swapon /dev/mapper/swap
# mount root
mkdir /mnt/gentoo
mount /dev/md2 /mnt/gentoo
# mount usr
mkdir /mnt/gentoo/usr
mount /dev/vg/usr /mnt/gentoo/usr
# mount UEFI partition as boot
mkdir /mnt/gentoo/boot
mount /dev/sdb1 /mnt/gentoo/boot
# mounting var
mkdir /mnt/gentoo/var
mount /dev/mapper/var /mnt/gentoo/var

Continue the installation with downloading and unpacking the stage tarball as explained in the Gentoo handbook chapter 5 until you reach chapter 5.c.

Before we move on we quickly add a tmpfs for /tmp and /var/tmp. /tmp because we do not want to have any resources used during installation cropping up on the unencrypted root. And /var/tmp because it is used heavily by portage and there is no need to have it encrypting and decrypting all the time during installation.

# creating a tmpfs for /tmp
mount -t tmpfs tmpfs /mnt/gentoo/tmp
# and another one for /var/tmp
mount -t tmpfs tmpfs /mnt/gentoo/var/tmp

The default maximum size for a tmpfs is half your RAM. It will only be used as needed. Depending on the size of your RAM you may actually have to disable it and use the real /var/tmp for the really big package installations like LibreOffice and so forth.
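To get a feeling for how much room that default gives you, here is a small sketch. The function name tmpfs_default_mib is made up; it simply halves a RAM size given in KiB, the unit used by /proc/meminfo.

```shell
# tmpfs_default_mib: default tmpfs limit (half of RAM) in MiB for a RAM size in KiB.
tmpfs_default_mib() {
    echo $(( $1 / 2 / 1024 ))
}

tmpfs_default_mib 8388608        # 8 GiB of RAM

# The same for the running machine, if /proc/meminfo is readable:
if [ -r /proc/meminfo ]; then
    tmpfs_default_mib "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
fi

# An explicit limit can be set with the size= mount option, for example:
#   mount -t tmpfs -o size=6G tmpfs /mnt/gentoo/var/tmp
```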

Continue with chapter 5.c. In chapter 6.b I chose the profile "default/linux/amd64/13.0" for a minimal system. When you have reached the paragraph about setting the timezone, skip it and continue here. We will set the timezone with timedatectl later on.

Locale

The locale will be set later on with localectl but we will generate the desired locales here first. Edit /etc/locale.gen and activate the desired locales. I selected English and German.

en_US.UTF-8 UTF-8
de_DE.UTF-8 UTF-8

Then generate them with:

locale-gen

Root

While in the chroot don't forget to set your root password:

passwd

RAID Configuration

While dracut will auto assemble RAID arrays in our setup it is a good idea to have an mdadm.conf to describe the current and desired layout.

emerge -av mdadm

Then set up an /etc/mdadm.conf file like this:

DEVICE /dev/sd[a-e]*
ARRAY /dev/md1 metadata=1.2 UUID=fa009f73:b43fab9c:42b39f4a:91e28858
ARRAY /dev/md2 metadata=1.2 UUID=6cf87b94:25775da3:7fdeefeb:b66dd3d4
ARRAY /dev/md3 metadata=1.2 UUID=139f2034:4005ccf4:b0f0b842:54796b0f

You can get the UUIDs with:

mdadm --misc -D /dev/md1

Of course you have to adapt this for your setup. To make it easy in case you are copy and pasting from this post anyway you can use this:

echo "DEVICE /dev/sd[a-e]*" >> /etc/mdadm.conf
for i in 1 2 3; do echo "ARRAY /dev/md$i metadata=1.2 UUID=$(mdadm --misc -D /dev/md$i | egrep -o '([0-9a-f]{8}:?){4}')" >> /etc/mdadm.conf ; done

This configuration will be read and used by dracut later on.
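
The UUID extraction used in the loop above can be tried on its own. This is a sketch with a slightly stricter pattern that requires the colons; the helper name md_uuid is made up, and the sample line mimics the UUID line in the output of mdadm --detail.

```shell
# md_uuid (hypothetical helper): extract the array UUID from "mdadm --detail"
# style text on stdin.
md_uuid() {
    grep -Eo '([0-9a-f]{8}:){3}[0-9a-f]{8}' | head -n 1
}

# Sample line; on a live system: mdadm --detail /dev/md1 | md_uuid
echo '           UUID : fa009f73:b43fab9c:42b39f4a:91e28858' | md_uuid

# mdadm can also print complete ARRAY lines itself:
#   mdadm --detail --scan
```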

LVM Configuration

First of all install the lvm tools.

emerge -av sys-fs/lvm2

To enable the general automatic activation of logical volumes we enable the use of lvmetad in the global section of /etc/lvm/lvm.conf

use_lvmetad = 1

The systemd unit lvm2-lvmetad.service will also have to be activated later on for this to work.

You can also download my lvm.conf for comparison.

We only want our volume group vg to be activated by default, so add the following in the activation section of /etc/lvm/lvm.conf

volume_list = [ "vg" ]

And to speed up the scanning for LVM volume groups, set a filter in the devices section:

filter = [ "r|/dev/nbd.*|", "a|/dev/md.*|" ]

This assumes all your physical LVM volumes are on /dev/md (i.e. RAID) devices.

Fstab

Here we define a proper fstab for our setup. I prefer using UUIDs here to avoid any confusion. You can easily find the UUIDs like this:

ls -l /dev/disk/by-uuid/
# or
blkid

For me this results in an fstab like this:

UUID=04C0-82C6                                  /boot           vfat            noauto,noatime  1 2
UUID=f202b721-03c7-4589-8898-1f4b7e882db2       /               ext4            noatime         0 1
UUID=2e554de3-cb35-470e-b542-446eeeeb24da       /usr            ext4            noatime         0 0
UUID=4c61fe1f-13ad-403a-b95f-3f7fa0596c4c       none            swap            sw              0 0
UUID=570592fc-fa46-4846-9d7f-418eb2f6dd95       /var            ext4            noatime         0 0
tmpfs                                           /var/tmp        tmpfs           nodev,nosuid    0 0

Remember to use the UUIDs of the mapped (i.e. opened) devices for LUKS encrypted volumes such as /var and swap. For example:

blkid | grep swap
/dev/mapper/swap: UUID="4c61fe1f-13ad-403a-b95f-3f7fa0596c4c" TYPE="swap"

I also tried to use /dev/mapper/var for var but that led to problems while trying to mount it during boot. It seems it is not a good idea to mix UUIDs and device paths. So I stick to UUIDs only.

Crypttab

Some of our encrypted partitions should be loaded at boot time. To enforce this we put them in the /etc/crypttab.

var     UUID=f531cc7f-a201-4bf1-80f2-e7a8621e7ff7
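
Note that the crypttab entry uses the UUID of the LUKS container, while the fstab uses the UUID of the file system inside the opened volume. The following sketch picks the right one from blkid style output. The helper name uuid_of_type is made up, and the sample lines are an assumption about how blkid prints the two devices on this setup.

```shell
# uuid_of_type (hypothetical helper): from blkid-style lines on stdin, print the
# UUID of the first entry whose TYPE matches the argument.
uuid_of_type() {
    grep "TYPE=\"$1\"" | sed -n 's/.* UUID="\([^"]*\)".*/\1/p' | head -n 1
}

sample='/dev/mapper/vg-var: UUID="f531cc7f-a201-4bf1-80f2-e7a8621e7ff7" TYPE="crypto_LUKS"
/dev/mapper/var: UUID="570592fc-fa46-4846-9d7f-418eb2f6dd95" TYPE="ext4"'

# /etc/crypttab wants the UUID of the LUKS container ...
printf 'var\tUUID=%s\n' "$(echo "$sample" | uuid_of_type crypto_LUKS)"
# ... while /etc/fstab wants the UUID of the file system inside it.
printf 'UUID=%s\t/var\text4\tnoatime\t0 0\n' "$(echo "$sample" | uuid_of_type ext4)"
```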

Emerging systemd

There are several things to consider for systemd. Check with the systemd wiki. I have copied the necessary parts for my setup below.

At the time of writing this step was recommended:

ln -sf /proc/self/mounts /etc/mtab

First of all we need systemd. Since we are using crypto we will need the cryptsetup useflag. Add it to your /etc/portage/make.conf. This is best done with euse from app-portage/gentoolkit.

emerge -av app-portage/gentoolkit
euse -E cryptsetup systemd gudev dbus

I added gudev and dbus because they will be required by NetworkManager anyway.

There is a circular dependency with dbus, so proceed like this:

USE="-systemd" emerge -av sys-apps/dbus
emerge -av sys-apps/systemd
emerge -av sys-apps/dbus

We also add the networkmanager, so that you can get your network up and running quickly.

emerge -av net-misc/networkmanager

Depending on your selected profile this may pull quite a bit of stuff.

Configuring systemd

To get our /var detected and activated we need to enable the lvm2-lvmetad.service. But since we are still in a chroot we cannot use systemctl directly. So we do it manually.

cd /etc/systemd/system
mkdir sysinit.target.wants
cd sysinit.target.wants
ln -s /usr/lib/systemd/system/lvm2-lvmetad.service
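
The manual linking can be wrapped into a small function. This is only a sketch of roughly what "systemctl enable" does for a unit that is wanted by a target; the helper name enable_unit and its root directory argument are made up, and the demonstration runs in a scratch directory instead of the real /etc.

```shell
# enable_unit (hypothetical helper): link a unit into <target>.wants below a
# given root directory.
enable_unit() {
    root=$1 unit=$2 target=$3
    mkdir -p "$root/etc/systemd/system/$target.wants"
    ln -sf "/usr/lib/systemd/system/$unit" "$root/etc/systemd/system/$target.wants/$unit"
}

# Demonstration in a scratch directory; inside the chroot the root would be "".
demo=$(mktemp -d)
enable_unit "$demo" lvm2-lvmetad.service sysinit.target
ls -l "$demo/etc/systemd/system/sysinit.target.wants"
```

systemctl's --root= option may be an alternative for enabling units from outside a running system.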

Configuring the Kernel

We are building our own kernel here. So you should especially know which device drivers to pick. If you are unsure also check with the handbook chapter 7. Make sure you build your kernel with UEFI options as well as RAID1 and LVM enabled.

emerge -av gentoo-sources
cd /usr/src/linux
make menuconfig

Then make sure to select:

Gentoo Linux > Support for init systems, system and service managers > [*] systemd
General setup > [*] Initial RAM filesystem and RAM disk (initramfs/initrd) support
Enable the block layer > Partition Types > [*]   EFI GUID Partition support
Processor type and features > [*] EFI runtime service support
Device Drivers > Generic Driver Options > () path to uevent helper
Device Drivers > Multiple devices driver support (RAID and LVM) > <*>   RAID support 
Device Drivers > Multiple devices driver support (RAID and LVM) > <*>   RAID-1 (mirroring) mode
Device Drivers > Multiple devices driver support (RAID and LVM) > <*>   Device mapper support > <*>     Crypt target support
Device Drivers > Multiple devices driver support (RAID and LVM) > <*>   Device mapper support > <*>     Mirror target
Device Drivers > Graphics support > Support for frame buffer devices > [*]   Enable firmware EDID
Device Drivers > Graphics support > Support for frame buffer devices > [*]   EFI-based Framebuffer Support
Device Drivers > Graphics support > Console display driver support > <*> Framebuffer Console support
Firmware Drivers > EFI (Extensible Firmware Interface)  Support > <*> EFI Variable Support via sysfs

You can also have a look at my .config file.

For the encryption I activated the following as well:

Cryptographic API > <*>   SHA256 digest algorithm (SSSE3/AVX/AVX2)
Cryptographic API > <*>   SHA512 digest algorithm (SSSE3/AVX/AVX2)
Cryptographic API > <*>   AES cipher algorithms (AES-NI)

If you forget the EFI based framebuffer it may later on appear as if your kernel hangs at boot because you won't see any further output.
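
Since a forgotten option is easy to miss in menuconfig, a quick check of the resulting .config can help. A minimal sketch: the helper name missing_options is made up, and the CONFIG_ symbols are my mapping of some of the menu entries above, so verify them against your kernel version.

```shell
# missing_options (hypothetical helper): print every option from the list that
# is not built in (=y) in the given kernel config file.
missing_options() {
    config=$1; shift
    for opt in "$@"; do
        grep -q "^$opt=y\$" "$config" || echo "$opt"
    done
}

# Tiny sample config for demonstration; on the real system use /usr/src/linux/.config
cfg=$(mktemp)
printf '%s\n' 'CONFIG_BLK_DEV_INITRD=y' 'CONFIG_MD_RAID1=y' '# CONFIG_DM_CRYPT is not set' > "$cfg"
missing_options "$cfg" CONFIG_BLK_DEV_INITRD CONFIG_EFI_PARTITION CONFIG_BLK_DEV_MD \
    CONFIG_MD_RAID1 CONFIG_BLK_DEV_DM CONFIG_DM_CRYPT CONFIG_FB_EFI
rm -f "$cfg"
```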

Compile the kernel and copy it to /boot

make && make modules_install && make install

Dracut

Since our /usr is an LVM volume and swap, which may be needed for resuming from hibernation, is encrypted we need an initramfs. In the old days we would have compiled whatever tools we need for the initramfs statically and fiddled them together. This makes upgrades rather painful. As a solution we now have dracut.

Dracut basically inspects your environment and creates an initramfs to ensure all the essentials are there to mount the rootfs. And since we have a separate /usr and are using systemd we also have to make sure that it gets mounted properly.

At the time of writing dracut was still in ~amd64 so we emerge it like this:

echo "sys-kernel/dracut" >> /etc/portage/package.keywords
echo 'DRACUT_MODULES="crypt lvm mdraid"' >> /etc/portage/make.conf
euse -E device-mapper
emerge -av dracut

Now we will add some configuration options to /etc/dracut.conf. Since Dracut does not assemble RAID or LVM by default we will select just the volumes we need. Activating everything with 'rd.auto=1' would be slower. The first 3 items are our RAID md devices. You can get the UUIDs like this:

emerge -av sys-fs/mdadm
mdadm --misc -D /dev/md1 | grep UUID

They have to be in the mdadm format with ':' and not '-'. The fourth line just tells dracut to activate the vg/usr LVM volume.

kernel_cmdline+=" rd.md.uuid=fa009f73:b43fab9c:42b39f4a:91e28858 "  
kernel_cmdline+=" rd.md.uuid=6cf87b94:25775da3:7fdeefeb:b66dd3d4 "  
kernel_cmdline+=" rd.md.uuid=139f2034:4005ccf4:b0f0b842:54796b0f "  
kernel_cmdline+=" rd.lvm.lv=vg/usr "  
kernel_cmdline+=" rd.luks.uuid=49130618-5eeb-4746-8e68-4237149eafe4 "
kernel_cmdline+=" resume=UUID=4c61fe1f-13ad-403a-b95f-3f7fa0596c4c "
add_dracutmodules+="crypt lvm mdraid resume"
mdadmconf="yes"

The spaces between the quotes and the content are important!

The next 2 lines are the encrypted swap partition and the unencrypted swap volume. Listing the encrypted partition here with rd.luks.uuid will instruct dracut to ask for the password. The kernel will then see the open/unencrypted version as the swap partition to use for resuming from hibernation.

You can get them like this:

blkid | egrep '(/md1|/swap)'

Then we add the needed modules with add_dracutmodules+="crypt lvm mdraid resume".

And finally we enable our local mdadm.conf in the last line.
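
If you only have an array UUID in the dashed form that blkid prints, it can be regrouped into the mdadm form. This is a sketch under the assumption that both tools show the same hex digits in the same order, which you should verify against mdadm --detail. The helper names to_mdadm_uuid and md_cmdline are made up.

```shell
# to_mdadm_uuid (hypothetical helper): regroup a dashed UUID into four
# colon-separated groups of eight hex digits. Other input is passed through.
to_mdadm_uuid() {
    echo "$1" | tr -d '-' | sed -E 's/^(.{8})(.{8})(.{8})(.{8})$/\1:\2:\3:\4/'
}

# md_cmdline: print a dracut.conf line with the required spaces inside the quotes.
md_cmdline() {
    printf 'kernel_cmdline+=" rd.md.uuid=%s "\n' "$(to_mdadm_uuid "$1")"
}

md_cmdline fa009f73-b43f-ab9c-42b3-9f4a91e28858
```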

Then we generate the initramfs with the following command:

dracut --force --hostonly --add-device=/dev/disk/by-uuid/49130618-5eeb-4746-8e68-4237149eafe4 '' 3.10.17-gentoo

Adapt to your kernel version of course. Without the explicit version number it may not work if you are just in a chroot booted from a different kernel version. The --add-device option points to the raid device with the swap partition on it. It is necessary because otherwise dracut might not pick it up and booting will fail with something like dracut-initqueue: Failed to issue method call: Unit systemd-cryptsetup@luks\x2d...service failed to load: No such file or directory.

Or if you don't like to type the kernel version:

dracut --force --hostonly --add-device=/dev/disk/by-uuid/49130618-5eeb-4746-8e68-4237149eafe4 '' $(ls -l /usr/src/linux | sed 's/.*linux-//')
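
Parsing the output of ls works here, but reading the symlink directly is a bit more robust. A small sketch, assuming /usr/src/linux is a symlink to a directory named linux-<version>; the helper name kver_from_link is made up.

```shell
# kver_from_link (hypothetical helper): derive the kernel version from the
# target of a source symlink, e.g. linux -> linux-3.10.17-gentoo.
kver_from_link() {
    target=$(basename "$(readlink "$1")")
    echo "${target#linux-}"
}

# Demonstration with a scratch symlink; on the real system: kver_from_link /usr/src/linux
demo=$(mktemp -d)
ln -s linux-3.10.17-gentoo "$demo/linux"
kver_from_link "$demo/linux"
```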

Note: when your system is completely set up with keyboard layout and so forth you may want to regenerate the dracut initramfs once more. Otherwise it can happen that you have the wrong keyboard layout when you try to enter the passwords for /var and swap during boot.

Installing GRUB2

Add the desired platform to your make.conf and emerge grub2

echo 'GRUB_PLATFORMS="efi-64"' >> /etc/portage/make.conf
emerge -av sys-boot/grub:2
mkdir /boot/grub

Now uncomment and modify the default command line in /etc/default/grub:

GRUB_CMDLINE_LINUX="init=/usr/lib/systemd/systemd"

Build the configuration:

grub2-mkconfig -o /boot/grub/grub.cfg

The output should contain something like this:

Found linux image: /boot/vmlinuz-3.10.17-gentoo
Found initrd image: /boot/initramfs-3.10.17-gentoo.img

I got some errors like "/usr/sbin/grub2-probe: warning: Couldn't find physical volume '(null)'", but that does not seem to be a problem.

Now comes the annoying part. I am currently working in a chroot started from a non-UEFI system. You cannot really install GRUB2 for UEFI properly from a non-UEFI system. So we have to boot a UEFI capable Linux live disk. I always have a SystemRescueCD USB stick lying around, so I will use that one. Just be sure to select UEFI boot. On my mainboard this means pressing F11 during boot and then selecting the "UEFI: USB ..." entry.
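
Whether the live system really came up in UEFI mode can be checked before going any further: the kernel only provides /sys/firmware/efi when it was booted via UEFI. A minimal sketch; the helper name boot_mode and its optional path argument are made up.

```shell
# boot_mode (hypothetical helper): print "uefi" if an efi directory exists below
# the given firmware path (default /sys/firmware), otherwise "bios".
boot_mode() {
    if [ -d "${1:-/sys/firmware}/efi" ]; then echo uefi; else echo bios; fi
}

boot_mode    # run on the live system before attempting grub2-install
```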

You will of course have to manually put all the parts of your new system together and chroot. For me this means:

mount /dev/md2 /mnt/gentoo
mount /dev/vg/usr /mnt/gentoo/usr
mount /dev/sdb1 /mnt/gentoo/boot
cryptsetup open /dev/vg/var var
mount /dev/mapper/var /mnt/gentoo/var/
mount -t tmpfs none /mnt/gentoo/tmp
mount -t tmpfs none /mnt/gentoo/var/tmp
cp -L /etc/resolv.conf /mnt/gentoo/etc/
mount -t proc none /mnt/gentoo/proc
mount --rbind /sys /mnt/gentoo/sys
mount --rbind /dev /mnt/gentoo/dev
chroot /mnt/gentoo /bin/bash

If the live stick is a bit older you might need to use cryptsetup luksOpen /dev/vg/var var.

And in the chroot:

source /etc/profile
export PS1="(chroot) $PS1"

Then you can install grub2 like this:

grub2-install --efi-directory=/boot

This should not produce any errors, but instead just show the current boot order. In /boot you should find an EFI directory now and a new boot entry 'gentoo' has been added to the NVRAM of the UEFI BIOS.

The Reboot

During the installation you may have seen some complaints about an unset hostname or a missing locale. This is because they should be set through systemd. That is however only possible if the system was booted with systemd. So it is time to reboot now. If you have followed the steps closely your system should boot properly. You will be asked for the password for swap and /var during boot. Should something fail, reboot your old system or LiveCD and chroot into your new installation. Even though systemd will not work in the chroot you can use 'journalctl' to diagnose boot errors.

Note: I had some issues with the lvm2-lvmetad.service not starting correctly and therefore /var not being mounted. The system then hangs after "Reached target swap". This seems somehow related to manually creating the link for starting it (see "Configuring systemd" above). If that happens wait 30-60 seconds and you will be offered a root shell for maintenance. Log in and do the following:

rm -rf /etc/systemd/system/sysinit.target.wants
systemctl enable lvm2-lvmetad.service
reboot

Configuring systemd

Systemd is new territory for me so I summed up my most important settings in the following paragraphs.

I have watched quite a few videos and read some articles on systemd before this new install. There was some controversy, but I personally really like it. I suggest just giving it a try.

Setting the Locale

localectl set-locale LANG=en_US.utf8
localectl set-keymap de-latin1-nodeadkeys

If you have skipped the locale section earlier you have to generate the locales here first. You can do this as usual by editing /etc/locale.gen and running locale-gen.

Setting the Hostname

hostnamectl set-hostname testhost

Replace 'testhost' with your desired hostname.

Setting the Timezone

timedatectl set-timezone Europe/Berlin

Getting a Network Connection

systemctl enable NetworkManager
systemctl start NetworkManager

This will symlink the NetworkManager.service file from /usr/lib/systemd/system to /etc/systemd/system. This also means you can see all available service files in /usr/lib/systemd/system and if you like put your own in /etc/systemd/system.

You can check what the NetworkManager has logged like this:

journalctl -u NetworkManager

In my case it complained about the missing NetworkManager.conf but got my wired connection working with dhcp anyway.

Regular Users

Until now we have done everything as root and do of course need to add regular users for everyday work. Since each of them has their own LUKS encrypted LVM volume and should also be able to unlock swap and /var some additional steps are required.

Automounting /home

The base system is now up and running. We do however still need to add some functionality so that our user home directories will be mounted on login.

For this we will use pam_mount.

emerge -av pam_mount

And in /etc/pam.d/system-auth add the following in the top block:

auth            optional        pam_mount.so

And this in the bottom block:

session         optional        pam_mount.so

Adding a User

Adding a user requires some extra work. The following assumes the user name 'user1'.

Add a Regular User

useradd -m user1
passwd user1

Add Encrypted Home Volume

Use the same password you used with passwd here.

lvcreate -L 10G -n home-user1 vg
cryptsetup luksFormat /dev/vg/home-user1
cryptsetup open /dev/vg/home-user1 home-user1
mkfs.ext4 /dev/mapper/home-user1
mount /dev/mapper/home-user1 /home/user1
chmod 0700 /home/user1
chown user1:user1 /home/user1
umount /home/user1
cryptsetup luksClose /dev/mapper/home-user1

Automounting Home Volume

Now edit /etc/security/pam_mount.conf.xml and add:

<volume user="user1" mountpoint="/home/user1" path="/dev/vg/home-user1" fstype="crypt" options="noatime" />

Put it before the last '</pam_mount>'.

Now log out or switch to another tty and log in as 'user1'. It should automatically mount the encrypted LVM volume as your home. Since your home is the root of an ext4 volume you should see the default 'lost+found' directory in there.

If you want a newly added user to be able to unlock /var and swap during boot you need to add that password to the /var and swap volumes as shown in the paragraph "Encrypting the Volumes" above.
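
The steps for a new user can be collected in one place. The following is a sketch only: add_crypt_user, run and DRY_RUN are names I made up, and by default it just prints the commands from this section instead of executing them. The pam_mount.conf.xml entry still has to be added by hand.

```shell
# Print (DRY_RUN=1, the default) or execute the steps for adding a user with an
# encrypted home volume. All helper names here are hypothetical.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

add_crypt_user() {
    user=$1 size=$2
    run useradd -m "$user"
    run passwd "$user"
    run lvcreate -L "$size" -n "home-$user" vg
    run cryptsetup luksFormat "/dev/vg/home-$user"
    run cryptsetup open "/dev/vg/home-$user" "home-$user"
    run mkfs.ext4 "/dev/mapper/home-$user"
    run mount "/dev/mapper/home-$user" "/home/$user"
    run chmod 0700 "/home/$user"
    run chown "$user:$user" "/home/$user"
    run umount "/home/$user"
    run cryptsetup close "home-$user"
    # let the user unlock swap and /var during boot
    run cryptsetup luksAddKey /dev/md1
    run cryptsetup luksAddKey /dev/vg/var
}

add_crypt_user user2 10G
```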

Changing the Password

Changing Keys

If you want to change a user's password you also need to change it on the swap and /var LUKS volumes.

cryptsetup luksChangeKey /dev/md1
[enter old and new password]
cryptsetup luksChangeKey /dev/vg/var
[enter old and new password]

Changing Password

The user can do this himself with the regular:

passwd

Removing a User

Removing a user also requires some steps. Be sure not to forget to remove the key from swap and /var. Also the user needs to be logged out. The following steps assume the user name 'user1'.

Removing Keys

/dev/md1 is the swap partition.

cryptsetup luksRemoveKey /dev/md1
[enter user1 password]
cryptsetup luksRemoveKey /dev/vg/var
[enter user1 password]

If you do not have the password you can remove a key with luksKillSlot. You do however need to know which keyslot it is. If you are expecting this to happen you should keep track of the assigned keyslots when creating the users.
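
To see which keyslots are in use you can look at the output of cryptsetup luksDump. The sketch below parses the "Key Slot N: ENABLED" lines of the LUKS1 format used by this cryptsetup version; the helper name enabled_slots is made up and the sample input is shortened.

```shell
# enabled_slots (hypothetical helper): list the enabled key slot numbers from
# LUKS1-style "cryptsetup luksDump" output on stdin.
enabled_slots() {
    sed -n 's/^Key Slot \([0-7]\): ENABLED$/\1/p'
}

# Sample input; on a live system: cryptsetup luksDump /dev/md1 | enabled_slots
printf '%s\n' 'Key Slot 0: ENABLED' 'Key Slot 1: ENABLED' 'Key Slot 2: DISABLED' | enabled_slots
```

To keep track from the start, a key can be placed in a fixed slot with the --key-slot option of luksAddKey, and that slot number is then the one to pass to luksKillSlot.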

Removing LVM volume and /home

lvremove vg/home-user1

Removing User Account

userdel -r user1

Hibernation/Suspend

Hibernation (suspend to disk) should work out of the box with this setup. You can try it with:

systemctl hibernate

On reboot you will then be asked the password (i.e. one of the passwords) for the swap partition by dracut and the system should resume from hibernation.
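
If hibernation does not work, first check whether the kernel offers it at all: /sys/power/state lists "disk" when suspend to disk is available. A minimal sketch; the helper name can_hibernate and its optional file argument are made up.

```shell
# can_hibernate (hypothetical helper): succeed if the given power state file
# (default /sys/power/state) lists "disk".
can_hibernate() {
    grep -qw disk "${1:-/sys/power/state}" 2>/dev/null
}

if can_hibernate; then
    echo "hibernation supported"
else
    echo "hibernation not available"
fi
```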

Resizing

Remember the beginning of this post when I filled just part of the disk with random data to initialize it because it took so long? Well I have filled the rest with random data and now it is time to use the rest of the space. This means:

1. increasing the size of the RAID1 array 
2. increasing the size of the physical volume belonging to the volume group
3. add some space to a users home volume
4. resize the LUKS volume on it
5. resize the contained ext4 volume

Since /usr is on the LVM on this RAID you will need to resize it by booting from another system. For example a LiveCD.

Stopping LUKS, LVM and RAID

If you decide to resize directly after you have made all the steps above or are using a LiveCD which activates most of the volumes automatically you will need to close LUKS volumes, deactivate volume groups and stop the RAID array.

cryptsetup close /dev/mapper/var
# check which drives (i.e. RAID array) belong to our volume group
pvs
# deactivate volume group
vgchange -a n vg
# stop raid
mdadm --manage -S /dev/md3

Changing the RAID Partitions

If your RAID already has 2 drives you will need to do the following steps for both drives.

parted /dev/sdb
(parted) print
Model: ATA WDC WD2002FAEX-0 (scsi)
Disk /dev/sdb: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  500MB   499MB   fat32        ESI      boot
 2      500MB   35.0GB  34.5GB               primary  raid
 3      35.0GB  50.0GB  15.0GB               primary  raid
 4      50.0GB  200GB   150GB                primary  raid
(parted) rm 4
(parted) mkpart raid1-lvm 50g 250g
(parted) set 4 raid on
(parted) quit

Re-assembling and Resizing the RAID Array

mdadm --assemble /dev/md3 /dev/sdb4
mdadm --grow /dev/md3 -z max --assume-clean

The grow step is important, because mdadm will not automatically use the new size of the underlying partition on reassembly.

Resizing the LVM volume

Resize the physical LVM volume to the partition size.

pvresize /dev/md3

Enlarging a home Volume

Adding 10GB to user1's home volume

lvresize -L +10G vg/home-user1

And then activate the volume

lvchange -a y vg/home-user1

Enlarging a LUKS Volume

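cryptsetup resize works on an opened mapping and not on the underlying device, so open the encrypted volume first and then resize it by its mapping name:

cryptsetup open /dev/vg/home-user1 home-user1
cryptsetup resize home-user1

Enlarging the ext4 File System

The encrypted volume is already open from the previous step.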

Check it:

e2fsck -f /dev/mapper/home-user1

And resize the file system:

resize2fs /dev/mapper/home-user1

You see there are quite a few steps necessary but all the tools are there and well established.

A Quick Performance Test

After the whole installation I thought to myself: "So many layers? This must all be slow."

emerge -av bonnie++

On the bare RAID (i.e. root):

bonnie++ -d /root -u root -s 10g -r 0

For this test I had an installation with a significantly bigger root. If you have followed the steps above exactly "10g" will not be possible.

I do not recommend using '-u root' on a system with important data on it. My test system is still empty at this point. So it is ok.

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
testhost       10G  1062  98 142701   7 65568   4  5765  94 207092   7 711.3   9

With /home/user1 opened and mounted:

bonnie++ -d /home/user1 -u root -s 10g -r 0

And the result:

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
testhost       10G  1000  97 128191   7 67962   5  5517  91 176171   6 692.1   8

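That's around 10% slower for block writes and about 15% slower for block reads than without encryption, while the rewrite figures are practically the same. So it barely matters. A loss of 10 to 15% is well worth the added security I think.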
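
The percentages can be recomputed from the two result tables. A small sketch using the block output, block input and rewrite columns; the function name pct_slower is only for illustration.

```shell
# pct_slower: by how many percent the second throughput is below the first.
pct_slower() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

pct_slower 142701 128191   # sequential block output (K/sec): bare RAID vs. LUKS volume
pct_slower 207092 176171   # sequential block input
pct_slower 65568 67962     # rewrite (a negative value means the LUKS volume was faster)
```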

List of used Software Versions

  • gentoo-sources-3.10.17
  • sys-fs/lvm2-2.02.103
  • sys-fs/cryptsetup-1.6.2
  • sys-kernel/dracut-034-r4
  • sys-apps/systemd-208-r2
  • sys-boot/grub-2.00_p5107-r2
  • sys-auth/pam_mount-2.14

All on amd64 architecture.

Conclusion / Known Issues

I guess this post is long enough and I better stop now before I squeeze in more stuff. There is still a lot to do. The next step will probably be to mess around a bit more with systemd, go through the logs and check everything.

At this time there are the following issues:

  • the dracut (swap) and (var) password inputs get overdrawn which is somewhat confusing
  • some logging for example from NetworkManager gets written to the console
  • pam_mount complains about something when a user logs in

These do however not seem very severe issues. I will probably address them in my next post.
