author: Andrew Jeffery <andrew@aj.id.au> 2021-04-01 15:06:14 +0300
committer: Brad Bishop <bradleyb@fuzziesquirrel.com> 2021-04-30 14:28:38 +0300
commit: b5cbe9bbba6740d13bbdc9091ec1fddc946edc2f
tree: 2b53340fedbfa241c79bcebdeabdffe992bdd791
parent: 6014707b9a95f3ca0604eff41561d159540ad39f
phosphor-mmc-init: exec switch_root(8) rather than chroot(1)

It was found that perf(1) had some issues with recording and analysing data on Rainier systems:

```
root@rainier:~# perf probe --add mem_serial_in
root@rainier:~# perf record -e probe:mem_serial_in -aR sleep 1
[ perf record: Woken up 1 times to write data ]
assertion failed at util/namespaces.c:257
No kallsyms or vmlinux with build-id e4e9c7cff1deb3bf32958039c696f094dc76cf5c was found
[ perf record: Captured and wrote 0.377 MB perf.data (25 samples) ]
root@rainier:~# perf script -v
build id event received for [kernel.kallsyms]: e4e9c7cff1deb3bf32958039c696f094dc76cf5c
broken or missing trace data
incompatible file format (rerun with -v to learn more)
```

Starting with the failed assertion in the recording, we find the relevant code is the following WARN_ON_ONCE():

```
void nsinfo__mountns_exit(struct nscookie *nc)
{
	...
	if (nc->oldcwd) {
		WARN_ON_ONCE(chdir(nc->oldcwd));
		zfree(&nc->oldcwd);
	}
```

A strace of `perf record` demonstrates the relevant syscall sequence, where /home/root is the working directory at the time when `perf record` is invoked.

```
openat(AT_FDCWD, "/proc/self/ns/mnt", O_RDONLY|O_LARGEFILE) = 12
openat(AT_FDCWD, "/proc/142/ns/mnt", O_RDONLY|O_LARGEFILE) = 13
setns(13, CLONE_NEWNS) = 0
statx(AT_FDCWD, "/mnt/rofs/bin/udevadm", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|0x1000, stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=978616, ...}) = 0
openat(AT_FDCWD, "/mnt/rofs/bin/udevadm", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 14
setns(12, CLONE_NEWNS) = 0
chdir("/home/root") = -1 ENOENT (No such file or directory)
```

From the path of the binary, PID 142 is executing in an unanticipated environment. Its path is representative of the state of the filesystem prior to the initramfs handing over to /sbin/init in the real root, suggesting an issue with the initramfs' /init implementation.

In /init we find a bunch of setup to discover and mount the root device.
At the end of the script we prepare for the real root by exec'ing chroot. From `man 2 chroot`[0]:

```
DESCRIPTION
       chroot() changes the root directory of the calling process to that
       specified in path. This directory will be used for pathnames
       beginning with /. The root directory is inherited by all children
       of the calling process.
```

Specifically, this outlines that chroot(2) affects the state of the calling *process* and not the state of the mount namespace in use by the process. Further, a call to `setns(..., CLONE_NEWNS)` explicitly replaces the mount namespace for the *process*, and as such destroys any chroot state that might have been associated with the process' original mount namespace. As the chroot state is not a property of a mount namespace, switching *back* to the application's original mount namespace does not restore the process' original chroot state.

As such, the chdir(2) from the strace output above returns an error, as the get_current_dir_name(3) call that yielded the provided path was issued prior to switching into the target process' mount namespace, and was thus derived in the chroot context. The path is therefore invalid once the original mount namespace is restored via the second setns(2), as the process has already lost the chroot context for the original namespace.

For perf(1) to work in its current implementation the effective root for PID 1 must remain the absolute path "/" with respect to the kernel's VFS layer. This requires /init to use either pivot_root(8) or switch_root(8). pivot_root is ruled out by the pivot_root(2) man page[1]:

```
NOTES
       ...
       The rootfs (initial ramfs) cannot be pivot_root()ed. The recommended
       method of changing the root filesystem in this case is to delete
       everything in rootfs, overmount rootfs with the new root, attach
       stdin/stdout/stderr to the new /dev/console, and exec the new
       init(1). Helper programs for this process exist; see switch_root(8).
       ...
```

As noted, the recommendation is a description of the switch_root(8) application[2]. The details of why the specific sequence for switch_root(8) is necessary are documented in [3].

Change /init to use switch_root(8) to avoid the nasty interaction of chroot(2) and setns(2).

[0] https://man7.org/linux/man-pages/man2/chroot.2.html#DESCRIPTION
[1] https://man7.org/linux/man-pages/man2/pivot_root.2.html#NOTES
[2] https://man7.org/linux/man-pages/man8/switch_root.8.html
[3] https://git.busybox.net/busybox/tree/util-linux/switch_root.c?h=1_32_1#n298

Change-Id: Iac29b53a462b03559d18fe9b600aefcd1951057e
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>

Diffstat (limited to 'meta-phosphor/recipes-phosphor/initrdscripts/phosphor-mmc-init/mmc-init.sh')
-rw-r--r-- meta-phosphor/recipes-phosphor/initrdscripts/phosphor-mmc-init/mmc-init.sh | 2
1 file changed, 1 insertion, 1 deletion
diff --git a/meta-phosphor/recipes-phosphor/initrdscripts/phosphor-mmc-init/mmc-init.sh b/meta-phosphor/recipes-phosphor/initrdscripts/phosphor-mmc-init/mmc-init.sh
index 061757519..41c1ad32b 100644
--- a/meta-phosphor/recipes-phosphor/initrdscripts/phosphor-mmc-init/mmc-init.sh
+++ b/meta-phosphor/recipes-phosphor/initrdscripts/phosphor-mmc-init/mmc-init.sh
@@ -63,4 +63,4 @@ for f in $fslist; do
 mount --move $f $rodir/$f
 done
-exec chroot $rodir /sbin/init
+exec switch_root $rodir /sbin/init
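
The hand-over sequence that the pivot_root(2) NOTES recommend, and that switch_root(8) automates, can be roughly sketched in shell. This is an illustration under stated assumptions and not the BusyBox implementation: the names `run`, `switch_root_sketch` and `DRY_RUN` are hypothetical, and the steps that delete the initramfs contents and reattach stdin/stdout/stderr to /dev/console are left out. The difference from the old `exec chroot $rodir /sbin/init` is the `mount --move . /` step. Once the new root has been moved on top of "/", the process root and the top of the mount namespace are the same location.

```shell
#!/bin/sh
# Hypothetical sketch of a switch_root-style hand-over; not the real tool.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# switch_root_sketch NEWROOT INIT
switch_root_sketch() {
    newroot=$1
    init=$2

    # 1. Enter the new root so "." keeps referring to it after the move.
    run cd "$newroot" || return 1

    # 2. (Omitted) delete the initramfs contents to release their memory.

    # 3. Overmount rootfs: move the new root on top of "/".
    run mount --move . / || return 1

    # 4. Re-anchor the process root onto the overmounted "/" and hand
    #    over to the real init. (Console reattachment is omitted.)
    run exec chroot . "$init"
}

# Example (prints the steps without touching any mounts):
#   DRY_RUN=1; switch_root_sketch /mnt/rofs /sbin/init
```

With `DRY_RUN=1` the function only prints the commands it would run, so the ordering can be inspected on a development machine. Without it, the sequence needs to run as PID 1 from an initramfs.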