summaryrefslogtreecommitdiff
path: root/fs/binfmt_elf.c
diff options
context:
space:
mode:
authorEric W. Biederman <ebiederm@xmission.com>2020-12-10 21:47:57 +0300
committerEric W. Biederman <ebiederm@xmission.com>2020-12-10 21:52:05 +0300
commit125c00af3b2c498875b275c4ad932b4db2c6bae2 (patch)
tree8390573b7ab598a4ddb548f9d2ede8f3ec7bbd2f /fs/binfmt_elf.c
parent3650b228f83adda7e5ee532e2b90429c03f7b9ec (diff)
parentfa67bf885e5211c7dce9514ef2877212c0a5e09e (diff)
downloadlinux-125c00af3b2c498875b275c4ad932b4db2c6bae2.tar.xz
exec: Move unshare_files and guarantee files_struct.count is correct
A while ago it was reported that posix file locking goes wrong when a multi-threaded process calls exec. I looked into the history and this is definitely a regression, that should be fixed if we can. This set of changes cleanups of the code in exec so hopefully this code will not regress again. Then it adds helpers and fixes the users of files_struct so the reference count is only incremented if COPY_FILES is passed to clone (or if io_uring takes a reference). Then it removes helpers (get_files_struct, __install_fd, __alloc_fd, __close_fd) that are no longer needed and if used would encourage code that increments the count of files_struct somewhere besides in clone when COPY_FILES is passed. In addition to fixing the bug in exec and simplifing the code this set of changes by virtue of getting files_struct.count correct it optimizes fdget. With proc and other places not temporarily increasing the count on files_struct __fget_light should succeed more often in being able to return a struct file without touching it's reference count. Fixing the count in files_struct was suggested by Oleg[1]. For those that are interested in the history of this issue I have included as much of it as I could find in the first change. Since v1: - Renamed the functions __fcheck_files -> files_lookup_fd_raw fcheck_files -> files_lookup_fd_locked fcheck_files -> files_lookup_fd_rcu fcheck_files -> lookup_fd_rcu fcheck_task -> task_lookup_fd_rcu fnext_task -> task_lookup_next_fd_rcu __close_fd_get_file -> close_fd_get_file - Simplified get_file_raw_ptr - Removed ksys_close - Examined the penalty for taking task_lock. The helper task_lookup_next_fd_rcu takes task_lock each iteration. Concern was expressed that this might be a problem. The function tid_fd_mode isn called from tid_fd_revalidate which is called when ever a file descriptor file is stat'ed, opened, or otherwise accessed. The function tid_fd_mode histrocally called get_files_struct which took and dropped task_lock. So the volume of task_lock calls is already proportional to the number of file descriptors. A micro benchmark did not see the move to task_lookup_next_fd_rcu making a difference in performance. Which suggests that the change to taking the task lock for every file descriptor found in task_lookup_next_fd will not be a problem. - Reviewed the code for conflicts with io_uring (especially the removal of get_files_struct). To my surprise no conflicts were found as io_uring does not use standard helpers but instead rolls it's own version of get_files_struct by hand. Documentation/filesystems/files.rst | 8 +- arch/powerpc/platforms/cell/spufs/coredump.c | 2 +- drivers/android/binder.c | 2 +- fs/autofs/dev-ioctl.c | 5 +- fs/coredump.c | 5 +- fs/exec.c | 29 +++---- fs/file.c | 124 +++++++++++++-------------- fs/io_uring.c | 2 +- fs/locks.c | 14 +-- fs/notify/dnotify/dnotify.c | 2 +- fs/open.c | 2 +- fs/proc/fd.c | 48 ++++------- include/linux/fdtable.h | 40 +++++---- include/linux/syscalls.h | 12 --- kernel/bpf/syscall.c | 20 +---- kernel/bpf/task_iter.c | 44 +++------- kernel/fork.c | 12 +-- kernel/kcmp.c | 29 ++----- 18 files changed, 153 insertions(+), 247 deletions(-) Eric W. Biederman (25): exec: Don't open code get_close_on_exec exec: Move unshare_files to fix posix file locking during exec exec: Simplify unshare_files exec: Remove reset_files_struct kcmp: In kcmp_epoll_target use fget_task bpf: In bpf_task_fd_query use fget_task proc/fd: In proc_fd_link use fget_task file: Rename __fcheck_files to files_lookup_fd_raw file: Factor files_lookup_fd_locked out of fcheck_files file: Replace fcheck_files with files_lookup_fd_rcu file: Rename fcheck lookup_fd_rcu file: Implement task_lookup_fd_rcu proc/fd: In tid_fd_mode use task_lookup_fd_rcu kcmp: In get_file_raw_ptr use task_lookup_fd_rcu file: Implement task_lookup_next_fd_rcu proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu proc/fd: In fdinfo seq_show don't use get_files_struct file: Merge __fd_install into fd_install file: In f_dupfd read RLIMIT_NOFILE once. file: Merge __alloc_fd into alloc_fd file: Rename __close_fd to close_fd and remove the files parameter file: Replace ksys_close with close_fd file: Rename __close_fd_get_file close_fd_get_file file: Remove get_files_struct [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com v1: https://lkml.kernel.org/r/87ft8l6ic3.fsf@x220.int.ebiederm.org Reported-by: Jeff Layton <jlayton@redhat.com> Reported-by: Daniel P. Berrangé <berrange@redhat.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/87r1on1v62.fsf@x220.int.ebiederm.org Link: https://lists.openvz.org/pipermail/criu/2020-November/045123.html Link: https://marc.info/?l=openvz-criu&m=160591423214257 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Diffstat (limited to 'fs/binfmt_elf.c')
0 files changed, 0 insertions, 0 deletions