path: root/lib/vdso
Age | Commit message | Author | Files | Lines
2020-08-06 | vdso/treewide: Add vdso_data pointer argument to __arch_get_hw_counter() | Thomas Gleixner | 1 | -2/+2
MIPS already uses and S390 will need the vdso data pointer in __arch_get_hw_counter(). This works nicely as long as the architecture does not support time namespaces in the VDSO. With time namespaces enabled the regular accessor to the vdso data pointer __arch_get_vdso_data() will return the namespace specific VDSO data page for tasks which are part of a non-root time namespace. This would cause the architectures which need the vdso data pointer in __arch_get_hw_counter() to access the wrong vdso data page. Add a vdso_data pointer argument to __arch_get_hw_counter() and hand it in from the call sites in the core code. For architectures which do not need the data pointer in their counter accessor function the compiler will just optimize it out. Fix up all existing architecture implementations and make MIPS utilize the pointer instead of invoking the accessor function. No functional change and no change in the resulting object code (except MIPS). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/draft-87wo2ekuzn.fsf@nanos.tec.linutronix.de
2020-06-12 | Merge tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -0/+11
Pull more x86 updates from Thomas Gleixner: "A set of fixes and updates for x86:
 - Unbreak paravirt VDSO clocks. While the VDSO code was moved into lib for sharing a subtle check for the validity of paravirt clocks got replaced. While the replacement works perfectly fine for bare metal as the update of the VDSO clock mode is synchronous, it fails for paravirt clocks because the hypervisor can invalidate them asynchronously. Bring it back as an optional function so it does not inflict this on architectures which are free of PV damage.
 - Fix the jiffies to jiffies64 mapping on 64bit so it does not trigger an ODR violation on newer compilers
 - Three fixes for the SSBD and *IB* speculation mitigation maze to ensure consistency, not disabling of some *IB* variants wrongly and to prevent a rogue cross process shutdown of SSBD. All marked for stable.
 - Add yet more CPU models to the splitlock detection capable list !@#%$!
 - Bring the pr_info() back which tells that TSC deadline timer is enabled.
 - Reboot quirk for MacBook6,1"
* tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso: Unbreak paravirt VDSO clocks
  lib/vdso: Provide sanity check for cycles (again)
  clocksource: Remove obsolete ifdef
  x86_64: Fix jiffies ODR violation
  x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
  x86/speculation: Prevent rogue cross-process SSBD shutdown
  x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
  x86/cpu: Add Sapphire Rapids CPU model number
  x86/split_lock: Add Icelake microserver and Tigerlake CPU models
  x86/apic: Make TSC deadline timer detection message visible
  x86/reboot/quirks: Add MacBook6,1 reboot quirk
2020-06-09 | lib/vdso: Provide sanity check for cycles (again) | Thomas Gleixner | 1 | -0/+11
The original x86 VDSO implementation checked for the validity of the clock source read by testing whether the returned signed cycles value is less than zero. This check was also used by the vdso read function to signal that the current selected clocksource is not VDSO capable. During the rework of the VDSO code the check was removed and replaced with a check for the clocksource mode being != NONE. This turned out to be a mistake because the check is necessary for paravirt and hyperv clock sources. The reason is that these clock sources have their own internal sequence counter to validate the clocksource at the point of reading it. This is necessary because the hypervisor can invalidate the clocksource asynchronously so a check during the VDSO data update is not sufficient. Having a separate indicator for the validity is slower than just validating the cycles value. The check for it being negative turned out to be the fastest implementation and safe as it would require an uptime of ~73 years with a 4GHz counter frequency to result in a false positive. Add an optional function to validate the cycles with a default implementation which allows the compiler to optimize it out for architectures which do not require it. Fixes: 5d51bee725cc ("clocksource: Add common vdso clock mode storage") Reported-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Miklos Szeredi <mszeredi@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200606221531.963970768@linutronix.de
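The sign-bit check and the uptime estimate from this commit message can be sketched in standalone C. This is an illustration only, not the kernel code: the helper names `cycles_ok()` and `years_until_sign_bit()` are hypothetical, and a 365-day year is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: a clocksource read is treated as invalid when the
 * value, reinterpreted as a signed 64-bit integer, is negative (bit 63 set).
 * The unsigned-to-signed conversion assumes a two's complement target.
 */
static inline bool cycles_ok(uint64_t cycles)
{
	return (int64_t)cycles >= 0;
}

/*
 * Whole years of uptime a free-running counter ticking at freq_hz needs
 * before bit 63 becomes set, which would be a false positive for the check.
 */
static inline uint64_t years_until_sign_bit(uint64_t freq_hz)
{
	const uint64_t seconds_per_year = 365ULL * 24 * 60 * 60;

	return (UINT64_C(1) << 63) / freq_hz / seconds_per_year;
}
```

For a 4 GHz counter the second helper yields 73, which matches the ~73 years quoted in the message.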
2020-06-03 | lib/vdso: Force inlining of __cvdso_clock_gettime_common() | Christophe Leroy | 1 | -1/+1
When adding gettime64() to a 32 bit architecture (namely powerpc/32) it has been noticed that GCC no longer inlines __cvdso_clock_gettime_common() because it is called twice (once by __cvdso_clock_gettime() and once by __cvdso_clock_gettime32()). This has the effect of seriously degrading the performance.
Before the implementation of gettime64(), gettime() runs in:
  clock-gettime-monotonic-raw: vdso: 1003 nsec/call
  clock-gettime-monotonic-coarse: vdso: 592 nsec/call
  clock-gettime-monotonic: vdso: 942 nsec/call
When adding a gettime64() entry point, the standard gettime() performance is degraded by 30% to 50%:
  clock-gettime-monotonic-raw: vdso: 1300 nsec/call
  clock-gettime-monotonic-coarse: vdso: 900 nsec/call
  clock-gettime-monotonic: vdso: 1232 nsec/call
Adding __always_inline() to __cvdso_clock_gettime_common() regains the original performance. In terms of code size, the inlining increases the code size by only 176 bytes. This is in the noise for a kernel image. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1ab6a62c356c3bec35d1623563ef9c636205bcda.1588079622.git.christophe.leroy@c-s.fr
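The pattern behind this change, one common helper shared by two entry points and forced inline, might look roughly like the sketch below. All names (`SKETCH_ALWAYS_INLINE`, `gettime_common()`, the `ts64`/`ts32` structs) are hypothetical, the returned time is a fixed dummy value, and the attribute is a GCC/Clang extension, so other compilers fall back to plain `inline`.

```c
#include <stdint.h>

/* Forced inlining is a GCC/Clang extension; fall back to plain inline. */
#if defined(__GNUC__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((__always_inline__))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

struct ts64 { int64_t sec; int64_t nsec; };
struct ts32 { int32_t sec; int32_t nsec; };

/*
 * Common helper with two callers. Without the attribute the compiler may
 * decide to emit it out of line once it is referenced more than once.
 */
static SKETCH_ALWAYS_INLINE int gettime_common(int clock, struct ts64 *ts)
{
	if (clock != 0)
		return -1;	/* unknown clock: caller falls back */

	ts->sec = 1;		/* dummy time value for the sketch */
	ts->nsec = 500;
	return 0;
}

/* 64-bit entry point: forwards directly to the common helper. */
static int gettime64_sketch(int clock, struct ts64 *ts)
{
	return gettime_common(clock, ts);
}

/* 32-bit entry point: calls the helper, then narrows the result. */
static int gettime32_sketch(int clock, struct ts32 *res)
{
	struct ts64 ts;
	int ret = gettime_common(clock, &ts);

	if (ret == 0) {
		res->sec = (int32_t)ts.sec;
		res->nsec = (int32_t)ts.nsec;
	}
	return ret;
}
```

Whether forcing the inline pays off depends on the target; the numbers quoted in the message are for powerpc/32.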
2020-03-21 | lib/vdso: Enable common headers | Vincenzo Frascino | 1 | -22/+0
The vDSO library should only include the necessary headers required for a userspace library (UAPI and a minimal set of kernel headers). To make this possible it is necessary to isolate from the kernel headers the common parts that are strictly necessary to build the library. Refactor the unified vdso code to use the common headers. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200320145351.32292-26-vincenzo.frascino@arm.com
2020-02-17 | lib/vdso: Allow architectures to provide the vdso data pointer | Christophe Leroy | 1 | -16/+56
On powerpc, __arch_get_vdso_data() clobbers the link register, requiring the caller to save it. As the parent function already has to set a stack frame and saves the link register before calling the C vdso function, retrieving the vdso data pointer there is less overhead. Split out the functional code from the __cvdso.*() interfaces into new static functions which can either be called from the existing interfaces with the vdso data pointer supplied via __arch_get_vdso_data() or directly from ASM code. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/abf97996602ef07223fec30c005df78e5ed41b2e.1580399657.git.christophe.leroy@c-s.fr Link: https://lkml.kernel.org/r/20200207124403.965789141@linutronix.de
2020-02-17 | lib/vdso: Allow architectures to override the ns shift operation | Christophe Leroy | 1 | -2/+9
On powerpc/32, GCC (8.1) generates pretty bad code for the ns >>= vd->shift operation taking into account that the shift is always <= 32 and the upper part of the result is likely to be zero. GCC makes reversed assumptions considering the shift to be likely >= 32 and the upper part to be likely not zero.
  unsigned long long shift(unsigned long long x, unsigned char s)
  {
  	return x >> s;
  }
results in:
  00000018 <shift>:
    18: 35 25 ff e0  addic.  r9,r5,-32
    1c: 41 80 00 10  blt     2c <shift+0x14>
    20: 7c 64 4c 30  srw     r4,r3,r9
    24: 38 60 00 00  li      r3,0
    28: 4e 80 00 20  blr
    2c: 54 69 08 3c  rlwinm  r9,r3,1,0,30
    30: 21 45 00 1f  subfic  r10,r5,31
    34: 7c 84 2c 30  srw     r4,r4,r5
    38: 7d 29 50 30  slw     r9,r9,r10
    3c: 7c 63 2c 30  srw     r3,r3,r5
    40: 7d 24 23 78  or      r4,r9,r4
    44: 4e 80 00 20  blr
Even when forcing the shift to be smaller than 32 with an &= 31, it still considers the shift as likely >= 32. Move the default shift implementation into an inline which can be redefined in architecture code via a macro. [ tglx: Made the shift argument u32 and removed the __arch prefix ] Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/b3d449de856982ed060a71e6ace8eeca4654e685.1580399657.git.christophe.leroy@c-s.fr Link: https://lkml.kernel.org/r/20200207124403.857649978@linutronix.de
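The "default inline, redefinable via a macro" arrangement described here can be sketched as follows. The helper name `vdso_shift_ns` is an assumption of this sketch (the message does not spell it out), and userspace `uint64_t`/`uint32_t` stand in for the kernel's `u64`/`u32`.

```c
#include <stdint.h>

/*
 * Generic default. An architecture header included earlier could provide
 * its own implementation and define a macro of the same name, in which
 * case this block is skipped by the preprocessor.
 */
#ifndef vdso_shift_ns
static inline uint64_t vdso_shift_ns(uint64_t ns, uint32_t shift)
{
	return ns >> shift;
}
#endif
```

Because the override is resolved at preprocessing time, architectures that keep the default pay nothing for the indirection.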
2020-02-17 | lib/vdso: Allow fixed clock mode | Christophe Leroy | 1 | -2/+9
Some architectures have a fixed clocksource which is known at compile time and cannot be replaced or disabled at runtime, e.g. timebase on PowerPC. For such cases the clock mode check in the VDSO code is pointless. Move the check for a VDSO capable clocksource into an inline function and allow architectures to redefine it via a macro. [ tglx: Removed the #ifdef mess ] Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/20200207124403.748756829@linutronix.de
2020-02-17 | lib/vdso: Move VCLOCK_TIMENS to vdso_clock_modes | Thomas Gleixner | 1 | -8/+10
Move the time namespace indicator clock mode to the other ones for consistency's sake. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/20200207124403.656097274@linutronix.de
2020-02-17 | lib/vdso: Cleanup clock mode storage leftovers | Thomas Gleixner | 2 | -15/+5
Now that all architectures are converted to use the generic storage the helpers and conditionals can be removed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/20200207124403.470699892@linutronix.de
2020-02-17 | clocksource: Add common vdso clock mode storage | Thomas Gleixner | 2 | -2/+14
All architectures which use the generic VDSO code have their own storage for the VDSO clock mode. That's pointless and just requires duplicate code. Provide generic storage for it. The new Kconfig symbol is intermediate and will be removed once all architectures are converted over. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/20200207124403.028046322@linutronix.de
2020-02-17 | lib/vdso: Allow the high resolution parts to be compiled out | Thomas Gleixner | 1 | -0/+11
If the architecture knows at compile time that there is no VDSO capable clocksource supported it makes sense to optimize the guts of the high resolution parts of the VDSO out at build time. Add a helper function to check whether the VDSO should be high resolution capable and provide a stub which can be overridden by an architecture. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/20200207124402.530143168@linutronix.de
2020-01-16 | lib/vdso: Only read hrtimer_res when needed in __cvdso_clock_getres() | Christophe Leroy | 1 | -3/+1
Only perform READ_ONCE(vd[CS_HRES_COARSE].hrtimer_res) for HRES and RAW clocks. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/7ac2f0d21652f95e2bbdfa6bd514ae6c7caf53ab.1579196675.git.christophe.leroy@c-s.fr
2020-01-14 | lib/vdso: Prepare for time namespace support | Thomas Gleixner | 2 | -4/+144
To support time namespaces in the vdso with a minimal impact on regular non time namespace affected tasks, the namespace handling needs to be hidden in a slow path. The most obvious place is vdso_seq_begin(). If a task belongs to a time namespace then the VVAR page which contains the system wide vdso data is replaced with a namespace specific page which has the same layout as the VVAR page. That page has vdso_data->seq set to 1 to enforce the slow path and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time namespace handling path. The extra check in the case that vdso_data->seq is odd, e.g. a concurrent update of the vdso data is in progress, is not really affecting regular tasks which are not part of a time namespace as the task is spin waiting for the update to finish and vdso_data->seq to become even again. If a time namespace task hits that code path, it invokes the corresponding time getter function which retrieves the real VVAR page, reads host time and then adds the offset for the requested clock which is stored in the special VVAR page. If VDSO time namespace support is disabled the whole magic is compiled out. Initial testing shows that the disabled case is almost identical to the host case which does not take the slow timens path. With the special timens page installed the performance hit is constant time and in the range of 5-7%. For the vdso functions which are not using the sequence count an unconditional check for vdso_data->clock_mode is added which switches to the real vdso when the clock_mode is VCLOCK_TIMENS. [avagin: Make do_hres_timens() work with raw clocks too: choose vdso_data pointer by CS_RAW offset.] Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-21-dima@arista.com
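The slow-path idea, an odd sequence count combined with a time namespace clock mode redirecting the reader to the host page, can be sketched in userspace C. This is a heavily simplified illustration under stated assumptions: the struct layout, the `MODE_*` constants and the function names are hypothetical, memory barriers are omitted, a single `sec` field stands in for the per-clock data, and the host page is passed in as a parameter instead of being looked up.

```c
#include <stdint.h>

/* Illustrative clock modes; names and values are assumptions of this sketch. */
enum { MODE_NONE, MODE_ARCH, MODE_TIMENS };

struct vd_sketch {
	uint32_t seq;		/* even: stable, odd: update in progress or timens page */
	int32_t  clock_mode;
	uint64_t sec;		/* host seconds on the real page, offset on a timens page */
};

/* Volatile load standing in for READ_ONCE(); barriers are left out. */
static inline uint32_t load_seq(const struct vd_sketch *vd)
{
	return *(const volatile uint32_t *)&vd->seq;
}

/* Plain sequence-count read of the host page. */
static uint64_t read_host_sec(const struct vd_sketch *host)
{
	uint32_t seq;
	uint64_t sec;

	do {
		while ((seq = load_seq(host)) & 1)
			;	/* spin until the writer has finished */
		sec = host->sec;
	} while (seq != load_seq(host));

	return sec;
}

/*
 * Reader as seen by a task. Regular tasks only reach the extra check while
 * an update is in flight; a timens page keeps seq odd, so a namespaced task
 * always takes it and adds its offset to the host time.
 */
static uint64_t read_sec(const struct vd_sketch *vd, const struct vd_sketch *host)
{
	uint32_t seq;
	uint64_t sec;

	do {
		while ((seq = load_seq(vd)) & 1) {
			if (vd->clock_mode == MODE_TIMENS)
				return read_host_sec(host) + vd->sec;
		}
		sec = vd->sec;
	} while (seq != load_seq(vd));

	return sec;
}
```

The fast path for tasks outside a time namespace is unchanged apart from the one comparison inside the spin loop, which is the property the message emphasizes.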
2020-01-14 | lib/vdso: Mark do_hres() and do_coarse() as __always_inline | Andrei Vagin | 1 | -6/+8
Performance numbers for Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz (more clock_gettime() cycles - the better):
  clock            | before    | after     | diff
  ----------------------------------------------------
  monotonic        | 153222105 | 166775025 | 8.8%
  monotonic-coarse | 671557054 | 691513017 | 3.0%
  monotonic-raw    | 147116067 | 161057395 | 9.5%
  boottime         | 153446224 | 166962668 | 9.1%
The improvement for arm64 for monotonic and boottime is around 3.5%.
  clock            | before   | after    | diff
  ==================================================
  monotonic        | 17326692 | 17951770 | 3.6%
  monotonic-coarse | 43624027 | 44215292 | 1.3%
  monotonic-raw    | 17541809 | 17554932 | 0.1%
  boottime         | 17334982 | 17954361 | 3.5%
[ tglx: Avoid the goto ] Signed-off-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-3-dima@arista.com
2020-01-14 | lib/vdso: Avoid duplication in __cvdso_clock_getres() | Christophe Leroy | 1 | -6/+1
VDSO_HRES and VDSO_RAW clocks are handled the same way. Avoid the code duplication. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/fdf1a968a8f7edd61456f1689ac44082ebb19c15.1577111367.git.christophe.leroy@c-s.fr
2020-01-14 | lib/vdso: Let do_coarse() return 0 to simplify the callsite | Christophe Leroy | 1 | -7/+8
do_coarse() is similar to do_hres() except that it never fails. Change its type to int instead of void and let it always return success (0) to simplify the call site. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/21e8afa38c02ca8672c2690307383507fe63b454.1577111367.git.christophe.leroy@c-s.fr
2020-01-14 | lib/vdso: Remove checks on return value for 32 bit vDSO | Vincenzo Frascino | 1 | -5/+5
Since all the architectures that support the generic vDSO library have been converted to support the 32 bit fallbacks it is not required anymore to check the return value of __cvdso_clock_get*time32_common() before updating the old_timespec fields. Remove the related checks from the generic vdso library. References: c60a32ea4f45 ("lib/vdso/32: Provide legacy syscall fallbacks") Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20190830135902.20861-6-vincenzo.frascino@arm.com
2020-01-14 | lib/vdso: Remove VDSO_HAS_32BIT_FALLBACK | Vincenzo Frascino | 1 | -10/+0
VDSO_HAS_32BIT_FALLBACK was introduced to address a regression which caused seccomp to deny access to the applications to clock_gettime64() and clock_getres64() because they are not enabled in the existing filters. The purpose of VDSO_HAS_32BIT_FALLBACK was to simplify the conditional implementation of __cvdso_clock_get*time32() variants. Now that all the architectures that support the generic vDSO library have been converted to support the 32 bit fallbacks the conditional can be removed. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20190830135902.20861-5-vincenzo.frascino@arm.com References: c60a32ea4f45 ("lib/vdso/32: Provide legacy syscall fallbacks")
2020-01-14 | lib/vdso: Build 32 bit specific functions in the right context | Vincenzo Frascino | 1 | -0/+4
clock_gettime32 and clock_getres_time32 should be compiled only with a 32 bit vdso library. Exclude these symbols when BUILD_VDSO32 is not defined. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20190830135902.20861-3-vincenzo.frascino@arm.com
2020-01-10 | lib/vdso: Make __cvdso_clock_getres() static | Vincenzo Frascino | 1 | -0/+1
Fix the following sparse warning in the generic vDSO library: linux/lib/vdso/gettimeofday.c:224:5: warning: symbol '__cvdso_clock_getres' was not declared. Should it be static? Make it static and also mark it __maybe_unused. Fixes: 502a590a170b ("lib/vdso: Move fallback invocation to the callers") Reported-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191128111719.8282-1-vincenzo.frascino@arm.com
2019-11-15 | y2038: vdso: change time_t to __kernel_old_time_t | Arnd Bergmann | 1 | -2/+2
Only x86 uses the 'time' syscall in vdso, so change that to __kernel_old_time_t as a preparation for removing 'time_t' and '__kernel_time_t' later. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-10-23 | lib/vdso: Make clock_getres() POSIX compliant again | Thomas Gleixner | 1 | -4/+5
A recent commit removed the NULL pointer check from the clock_getres() implementation causing a test case to fault. POSIX requires an explicit NULL pointer check for clock_getres() aside from the validity check of the clock_id argument for obscure reasons. Add it back for both 32bit and 64bit. Note, this is only a partial revert of the offending commit which does not bring back the broken fallback invocation in the 32bit compat implementations of clock_getres() and clock_gettime(). Fixes: a9446a906f52 ("lib/vdso/32: Remove inconsistent NULL pointer checks") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1910211202260.1904@nanos.tec.linutronix.de
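The required behaviour, validating the clock id and writing the resolution only when the result pointer is non-NULL, can be sketched as below. Everything here is hypothetical scaffolding: `clock_getres_sketch()`, the `ts_sketch` struct, the clock bound and the 1 ns resolution are assumptions chosen for illustration, not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CLOCKS 16	/* assumed bound on valid clock ids */
#define SKETCH_RES_NS 1		/* assumed resolution in nanoseconds */

struct ts_sketch {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/*
 * Returns 0 on success and -1 for an invalid clock id. A NULL res is not
 * an error: the call succeeds and simply does not store the resolution.
 */
static int clock_getres_sketch(int clock, struct ts_sketch *res)
{
	if (clock < 0 || clock >= SKETCH_MAX_CLOCKS)
		return -1;

	if (res) {
		res->tv_sec = 0;
		res->tv_nsec = SKETCH_RES_NS;
	}
	return 0;
}
```

Note that the clock id is validated before the pointer is looked at, so an invalid clock with a NULL pointer still reports the invalid clock.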
2019-10-07 | lib: vdso: Remove CROSS_COMPILE_COMPAT_VDSO | Vincenzo Frascino | 1 | -9/+0
arm64 was the last architecture using CROSS_COMPILE_COMPAT_VDSO config option. With this patch series the dependency in the architecture has been removed. Remove CROSS_COMPILE_COMPAT_VDSO from the Unified vDSO library code. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-31 | lib/vdso/32: Provide legacy syscall fallbacks | Thomas Gleixner | 1 | -1/+11
Seccomp denies applications access to the clock_gettime64() and clock_getres64() syscalls because they are not enabled in the existing filters. This regression is triggered by the fact that 32bit VDSOs use the new clock_gettime64() and clock_getres64() syscalls in the fallback path. Add a conditional to invoke the 32bit legacy fallback syscalls instead of the new 64bit variants. The conditional can go away once all architectures are converted. Fixes: 00b26474c2f1 ("lib/vdso: Provide generic VDSO implementation") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907301134470.1738@nanos.tec.linutronix.de
2019-07-31 | lib/vdso: Move fallback invocation to the callers | Thomas Gleixner | 1 | -17/+36
To allow syscall fallbacks using the legacy 32bit syscall for 32bit VDSO builds, move the fallback invocation out into the callers. Split the common code out of __cvdso_clock_gettime/getres() and invoke the syscall fallback in the 64bit and 32bit variants. Preparatory work for using legacy syscalls in 32bit VDSO. No functional change. Fixes: 00b26474c2f1 ("lib/vdso: Provide generic VDSO implementation") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/20190728131648.695579736@linutronix.de
2019-07-31 | lib/vdso/32: Remove inconsistent NULL pointer checks | Thomas Gleixner | 1 | -16/+2
The 32bit variants of vdso_clock_gettime()/getres() have a NULL pointer check for the timespec pointer. That's inconsistent vs. 64bit. But the vdso implementation will never be consistent versus the syscall because the only case which it can handle is NULL. Any other invalid pointer will cause a segfault. So special casing NULL is not really useful. Remove it along with the superfluous syscall fallback invocation as that will return -EFAULT anyway. That also gets rid of the dubious typecast which only works because the pointer is NULL. Fixes: 00b26474c2f1 ("lib/vdso: Provide generic VDSO implementation") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20190728131648.587523358@linutronix.de
2019-06-26 | lib/vdso: Make delta calculation work correctly | Thomas Gleixner | 1 | -4/+15
The x86 vdso implementation on which the generic vdso library is based has subtle (unfortunately undocumented) twists:
 1) The code assumes that the clocksource mask is U64_MAX which means that no bits are masked. Which is true for any valid x86 VDSO clocksource. Stupidly it still did the mask operation for no reason and at the wrong place right after reading the clocksource.
 2) It contains a sanity check to catch the case where slightly unsynchronized TSC values can be observed which would cause the delta calculation to make a huge jump. It therefore checks whether the current TSC value is larger than the value on which the current conversion is based. If it's not larger the base value is used to prevent time jumps.
#1 is not only stupid for the x86 case because it does the masking for no reason, it is also completely wrong for clocksources with a smaller mask which can legitimately wrap around during a conversion period. The core timekeeping code does it correctly by applying the mask after the delta calculation:
  (now - base) & mask
#2 is equally broken for clocksources which have smaller masks and can wrap around during a conversion period because there the now > base check is just wrong and causes stale time stamps and time going backwards issues.
Unbreak it by:
 1) Removing the mask operation from the clocksource read which makes the fallback detection work for all clocksources
 2) Replacing the conditional delta calculation with an overridable inline function.
#2 could reuse clocksource_delta() from the timekeeping code but that results in a significant performance hit for the x86 VDSO. The timekeeping core code must have the non optimized version as it has to operate correctly with clocksources which have smaller masks as well to handle the case where TSC is discarded as timekeeper clocksource and replaced by HPET or pmtimer. For the VDSO there is no replacement clocksource. If TSC is unusable the syscall is enforced which does the right thing.
To accommodate the needs of various architectures provide an overridable inline function which defaults to the regular delta calculation with masking:
  (now - base) & mask
Override it for x86 with the non-masking and checking version. This unbreaks the ARM64 syscall fallback operation, allows to use clocksources with arbitrary width and preserves the performance optimization for x86. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: linux-arch@vger.kernel.org Cc: LAK <linux-arm-kernel@lists.infradead.org> Cc: linux-mips@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: catalin.marinas@arm.com Cc: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux@armlinux.org.uk Cc: Ralf Baechle <ralf@linux-mips.org> Cc: paul.burton@mips.com Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: salyzyn@android.com Cc: pcc@google.com Cc: shuah@kernel.org Cc: 0x7f454c46@gmail.com Cc: linux@rasmusvillemoes.dk Cc: huw@codeweavers.com Cc: sthotton@marvell.com Cc: andre.przywara@arm.com Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906261159230.32342@nanos.tec.linutronix.de
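The difference between the two delta variants discussed in this commit can be made concrete with a small sketch. The function names `delta_masked()` and `delta_checked()` are hypothetical; the first follows the `(now - base) & mask` formula quoted in the message, the second models the compare-and-clamp behaviour described for x86 under the assumption that a non-advancing counter yields a delta of zero.

```c
#include <stdint.h>

/*
 * Generic variant: subtract first, then mask. A counter narrower than
 * 64 bits that wrapped between base and now still yields the right delta.
 */
static inline uint64_t delta_masked(uint64_t now, uint64_t base, uint64_t mask)
{
	return (now - base) & mask;
}

/*
 * x86-style variant as described in the message: no masking, and the
 * delta is clamped to zero when now is not ahead of base. Only correct
 * for full-width counters that do not wrap in practice.
 */
static inline uint64_t delta_checked(uint64_t now, uint64_t base)
{
	return now > base ? now - base : 0;
}
```

With a 32-bit counter that wrapped from 0xfffffff0 to 0x10, the masked variant returns 0x20 ticks while the checked variant returns 0, which is the stale time stamp problem the message attributes to narrow clocksources.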
2019-06-22 | lib/vdso: Add compat support | Vincenzo Frascino | 1 | -0/+4
Some 64 bit architectures have support for 32 bit applications that require a separate version of the vDSOs. Add support to the generic code for compat fallback functions. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Shijith Thotton <sthotton@marvell.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@armlinux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Huw Davies <huw@codeweavers.com> Link: https://lkml.kernel.org/r/20190621095252.32307-10-vincenzo.frascino@arm.com
2019-06-22 | lib/vdso: Provide generic VDSO implementation | Vincenzo Frascino | 3 | -0/+282
In the last few years the kernel gained quite some architecture specific vdso implementations which contain very similar code. Introduce a generic VDSO implementation of gettimeofday() which will be shareable between architectures once they are converted over. The implementation is based on the current x86 VDSO code. [ tglx: Massaged changelog and made the kernel doc tabular ] Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Shijith Thotton <sthotton@marvell.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@armlinux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Huw Davies <huw@codeweavers.com> Link: https://lkml.kernel.org/r/20190621095252.32307-3-vincenzo.frascino@arm.com