summaryrefslogtreecommitdiff
path: root/arch/x86/lib/copy_user_uncached_64.S
AgeCommit message (Collapse)AuthorFilesLines
2023-10-03x86/headers: Replace #include <asm/export.h> with #include <linux/export.h>Masahiro Yamada1-1/+1
The following commit: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost") deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>. Use <linux/export.h> in *.S as well as in *.c files. After all the <asm/export.h> lines are replaced, <asm/export.h> and <asm-generic/export.h> will be removed. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230806145958.380314-2-masahiroy@kernel.org
2023-04-21x86: rewrite '__copy_user_nocache' functionLinus Torvalds1-0/+242
I didn't really want to do this, but as part of all the other changes to the user copy loops, I've been looking at this horror. I tried to clean it up multiple times, but every time I just found more problems, and the way it's written, it's just too hard to fix them. For example, the code is written to do quad-word alignment, and will use regular byte accesses to get to that point. That's fairly simple, but it means that any initial 8-byte alignment will be done with cached copies. However, the code then is very careful to do any 4-byte _tail_ accesses using an uncached 4-byte write, and that was claimed to be relevant in commit a82eee742452 ("x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()"). So if you do a 4-byte copy using that function, it carefully uses a 4-byte 'movnti' for the destination. But if you were to do a 12-byte copy that is 4-byte aligned, it would _not_ do a 4-byte 'movnti' followed by a 8-byte 'movnti' to keep it all uncached. Instead, it would align the destination to 8 bytes using a byte-at-a-time loop, and then do a 8-byte 'movnti' for the final 8 bytes. The main caller that cares is __copy_user_flushcache(), which knows about this insanity, and has odd cases for it all. But I just can't deal with looking at this kind of "it does one case right, and another related case entirely wrong". And the code really wasn't fixable without hard drugs, which I try to avoid. So instead, rewrite it in a form that hopefully not only gets this right, but is a bit more maintainable. Knock wood. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>