[lvc-project] [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
Al Viro
viro at zeniv.linux.org.uk
Tue Jun 2 17:54:56 MSK 2026
On Tue, Jun 02, 2026 at 04:23:21PM +0300, Arefev wrote:
> The sequence of system calls before the crash could be as follows:
>
> fsopen("bdev", ...)
> fsconfig(fd_fs, FSCONFIG_CMD_CREATE, 0,0,0)
> fsmount(fd_fs, 0,0)
> move_mount(fd_mnt, "", AT_FDCWD, "./file1", 0x46ul)
Huh? Was "file1" a regular file, or was it actually
a directory? AFAICS, the d_is_dir() mismatch would be rejected
by do_move_mount()...
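For reference, the 0x46 passed to move_mount() above can be decoded with a small userspace helper. This is only a sketch: decode_move_mount_flags() and the table are made up for illustration, and the bit values are copied by hand on the assumption that they match the MOVE_MOUNT_* definitions in include/uapi/linux/mount.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper table; values assumed to match the
 * MOVE_MOUNT_* definitions in include/uapi/linux/mount.h. */
struct mm_flag_name {
	unsigned int bit;
	const char *name;
};

const struct mm_flag_name mm_flag_names[] = {
	{ 0x01u, "MOVE_MOUNT_F_SYMLINKS" },
	{ 0x02u, "MOVE_MOUNT_F_AUTOMOUNTS" },
	{ 0x04u, "MOVE_MOUNT_F_EMPTY_PATH" },
	{ 0x10u, "MOVE_MOUNT_T_SYMLINKS" },
	{ 0x20u, "MOVE_MOUNT_T_AUTOMOUNTS" },
	{ 0x40u, "MOVE_MOUNT_T_EMPTY_PATH" },
};

/*
 * Write a '|'-separated list of the known flag names set in 'flags'
 * into 'buf' (truncating if it does not fit) and return whatever
 * bits were not recognised.  'len' must be at least 1; with len == 0
 * nothing is written and 'flags' is returned unchanged.
 */
unsigned int decode_move_mount_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	if (len == 0)
		return flags;
	buf[0] = '\0';
	for (i = 0; i < sizeof(mm_flag_names) / sizeof(mm_flag_names[0]); i++) {
		if (!(flags & mm_flag_names[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, mm_flag_names[i].name, len - strlen(buf) - 1);
		flags &= ~mm_flag_names[i].bit;
	}
	return flags;
}
```

Under those assumed values, 0x46 comes out as MOVE_MOUNT_F_AUTOMOUNTS|MOVE_MOUNT_F_EMPTY_PATH|MOVE_MOUNT_T_EMPTY_PATH with no bits left over, i.e. an empty "from" path relative to fd_mnt.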
> The system call executed at the time of the crash:
>
> open("/dev/media0", ...);
>
> Simplified stacktrace:
>
> path_openat
> |-> link_path_walk
> |-> walk_component
> |-> __lookup_slow
> |-> old = inode->i_op->lookup(inode, dentry, flags); <- Oops
How the hell does that thing bound on top of "./file1" lead to
resolution of "/dev/media0" walking anywhere near it? Something's
missing here.
> Checking the fc->sb_flags flag before calling vfs_create_mount() is a great
> idea, if it helps prevent crashes in two more file systems, 'sockfs' and
> 'pipefs'.
Calling vfs_create_mount() is not a problem; refusing to attach
the result if SB_NOUSER has ended up in ->s_flags is the right
thing to do, but I would still like to understand how this call
of walk_component() managed to evade
	if (unlikely(!d_can_lookup(nd->path.dentry))) {
		if (nd->flags & LOOKUP_RCU) {
			if (!try_to_unlazy(nd))
				return -ECHILD;
		}
		return -ENOTDIR;
	}
on the previous iteration through link_path_walk() or, if it had been
the first one, the corresponding checks at chroot()/chdir()/fchdir() time.
Note that there are very legitimate objects with NULL ->lookup() - every
regular file is like that, obviously, but there also exist ones that look
like directories in mode bits, but still have NULL ->lookup(). See
d_flags_for_inode() and look for DCACHE_AUTODIR_TYPE there.
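The classification described above can be modelled in userspace roughly as follows. This is a toy sketch, not kernel code: every toy_* name is hypothetical, the enum stands in for the DCACHE_*_TYPE bits, and the model ignores symlinks, special files and the cached IOP_LOOKUP bit entirely. The point it illustrates is that a directory-looking inode with a NULL ->lookup() gets its own type, so the "can we look things up here?" test fails before ->lookup() is ever dereferenced.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's dentry type bits. */
enum toy_dentry_type {
	TOY_MISS_TYPE,		/* negative dentry, no inode */
	TOY_DIRECTORY_TYPE,	/* directory with a ->lookup() method */
	TOY_AUTODIR_TYPE,	/* directory mode bits, NULL ->lookup() */
	TOY_REGULAR_TYPE,	/* everything else in this model */
};

struct toy_inode {
	bool is_dir;			/* models S_ISDIR(inode->i_mode) */
	int (*lookup)(const char *name);/* models ->i_op->lookup */
};

/* Models the directory branch of d_flags_for_inode(). */
enum toy_dentry_type toy_type_for_inode(const struct toy_inode *inode)
{
	if (!inode)
		return TOY_MISS_TYPE;
	if (inode->is_dir)
		return inode->lookup ? TOY_DIRECTORY_TYPE : TOY_AUTODIR_TYPE;
	return TOY_REGULAR_TYPE;
}

/* Models d_can_lookup(): only a real directory type qualifies. */
bool toy_can_lookup(enum toy_dentry_type type)
{
	return type == TOY_DIRECTORY_TYPE;
}

/* Trivial ->lookup() used for illustration; always "succeeds". */
int toy_lookup_ok(const char *name)
{
	(void)name;
	return 0;
}

/* One step of a path walk: refuse before touching ->lookup(). */
int toy_walk_step(const struct toy_inode *dir, const char *name)
{
	if (!toy_can_lookup(toy_type_for_inode(dir)))
		return -ENOTDIR;
	return dir->lookup(name);
}
```

In this model, a step through { .is_dir = true, .lookup = NULL } or through a regular file returns -ENOTDIR, and only a directory with a non-NULL ->lookup() is ever called into.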
So whatever scenario has played out, you've got a call of walk_component()
with nd->path.dentry that should have failed d_can_lookup(). That ought
to have been prevented, and the check that prevents it had better sit much
closer to the lookup itself than anything fsmount(2) does.
Don't get me wrong - userland mounting of bdev and friends should not be
allowed, but that's not the only thing that went wrong in the reproducer.
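As a rough illustration of the "refuse to attach" idea, here is a userspace toy model. It is not the proposed patch: toy_super_block, TOY_SB_NOUSER and toy_may_attach() are hypothetical names, the bit position is only illustrative, and the choice of -EINVAL as the error is an assumption. Creating the mount object is left alone; only the attach step looks at the superblock flags.

```c
#include <errno.h>

/* Illustrative flag bit; assumed, not taken from kernel headers. */
#define TOY_SB_NOUSER	(1UL << 31)

/* Hypothetical stand-in for the relevant part of struct super_block. */
struct toy_super_block {
	unsigned long s_flags;
};

/*
 * Models the check at attach time: a superblock marked as not
 * mountable from userland is refused, whatever path created it.
 */
int toy_may_attach(const struct toy_super_block *sb)
{
	if (sb->s_flags & TOY_SB_NOUSER)
		return -EINVAL;	/* assumed error code */
	return 0;
}
```

The design point of the sketch is where the check sits: it tests the flags that actually ended up on the superblock, not the flags that were requested when the context was configured.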
BTW, how easy is it to trigger? Is that "you need to run for a few months
on a bunch of boxen" or "run this sequence and it'll crash that way"?