summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/mce/internal.h
AgeCommit message (Collapse)AuthorFilesLines
2023-12-15x86/mce: Handle Intel threshold interrupt stormsTony Luck1-0/+2
Add an Intel specific hook into machine_check_poll() to keep track of per-CPU, per-bank corrected error logs (with a stub for the CONFIG_MCE_INTEL=n case). When a storm is observed the rate of interrupts is reduced by setting a large threshold value for this bank in IA32_MCi_CTL2. This bank is added to the bitmap of banks for this CPU to poll. The polling rate is increased to once per second. When a storm ends reset the threshold in IA32_MCi_CTL2 back to 1, remove the bank from the bitmap for polling, and change the polling rate back to the default. If a CPU with banks in storm mode is taken offline, the new CPU that inherits ownership of those banks takes over management of storm(s) in the inherited bank(s). The cmci_discover() function was already very large. These changes pushed it well over the top. Refactor with three helper functions to bring it back under control. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231115195450.12963-4-tony.luck@intel.com
2023-12-15x86/mce: Add per-bank CMCI storm mitigationTony Luck1-1/+57
This is the core functionality to track CMCI storms at the machine check bank granularity. Subsequent patches will add the vendor specific hooks to supply input to the storm detection and take actions on the start/end of a storm. machine_check_poll() is called both by the CMCI interrupt code, and for periodic polls from a timer. Add a hook in this routine to maintain a bitmap history for each bank showing whether the bank logged an corrected error or not each time it is polled. In normal operation the interval between polls of these banks determines how far to shift the history. The 64 bit width corresponds to about one second. When a storm is observed a CPU vendor specific action is taken to reduce or stop CMCI from the bank that is the source of the storm. The bank is added to the bitmap of banks for this CPU to poll. The polling rate is increased to once per second. During a storm each bit in the history indicates the status of the bank each time it is polled. Thus the history covers just over a minute. Declare a storm for that bank if the number of corrected interrupts seen in that history is above some threshold (defined as 5 in this series, could be tuned later if there is data to suggest a better value). A storm on a bank ends if enough consecutive polls of the bank show no corrected errors (defined as 30, may also change). That calls the CPU vendor specific function to revert to normal operational mode, and changes the polling rate back to the default. [ bp: Massage. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231115195450.12963-3-tony.luck@intel.com
2023-12-15x86/mce: Remove old CMCI storm mitigation codeTony Luck1-6/+0
When a "storm" of corrected machine check interrupts (CMCI) is detected this code mitigates by disabling CMCI interrupt signalling from all of the banks owned by the CPU that saw the storm. There are problems with this approach: 1) It is very coarse grained. In all likelihood only one of the banks was generating the interrupts, but CMCI is disabled for all. This means Linux may delay seeing and processing errors logged from other banks. 2) Although CMCI stands for Corrected Machine Check Interrupt, it is also used to signal when an uncorrected error is logged. This is a problem because these errors should be handled in a timely manner. Delete all this code in preparation for a finer grained solution. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20231115195450.12963-2-tony.luck@intel.com
2023-10-16x86/mce: Cleanup mce_usable_address()Yazen Ghannam1-0/+2
Move Intel-specific checks into a helper function. Explicitly use "bool" for return type. No functional change intended. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230613141142.36801-4-yazen.ghannam@amd.com
2023-10-16x86/mce: Define amd_mce_usable_address()Yazen Ghannam1-0/+2
Currently, all valid MCA_ADDR values are assumed to be usable on AMD systems. However, this is not correct in most cases. Notifiers expecting usable addresses may then operate on inappropriate values. Define a helper function to do AMD-specific checks for a usable memory address. List out all known cases. [ bp: Tone down the capitalized words. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230613141142.36801-3-yazen.ghannam@amd.com
2023-08-18x86/MCE: Always save CS register on AMD Zen IF Poison errorsYazen Ghannam1-1/+4
The Instruction Fetch (IF) units on current AMD Zen-based systems do not guarantee a synchronous #MC is delivered for poison consumption errors. Therefore, MCG_STATUS[EIPV|RIPV] will not be set. However, the microarchitecture does guarantee that the exception is delivered within the same context. In other words, the exact rIP is not known, but the context is known to not have changed. There is no architecturally-defined method to determine this behavior. The Code Segment (CS) register is always valid on such IF unit poison errors regardless of the value of MCG_STATUS[EIPV|RIPV]. Add a quirk to save the CS register for poison consumption from the IF unit banks. This is needed to properly determine the context of the error. Otherwise, the severity grading function will assume the context is IN_KERNEL due to the m->cs value being 0 (the initialized value). This leads to unnecessary kernel panics on data poison errors due to the kernel believing the poison consumption occurred in kernel context. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230814200853.29258-1-yazen.ghannam@amd.com
2023-07-21x86/mce: Prevent duplicate error recordsBorislav Petkov (AMD)1-0/+1
A legitimate use case of the MCA infrastructure is to have the firmware log all uncorrectable errors and also, have the OS see all correctable errors. The uncorrectable, UCNA errors are usually configured to be reported through an SMI. CMCI, which is the correctable error reporting interrupt, uses SMI too and having both enabled, leads to unnecessary overhead. So what ends up happening is, people disable CMCI in the wild and leave on only the UCNA SMI. When CMCI is disabled, the MCA infrastructure resorts to polling the MCA banks. If a MCA MSR is shared between the logical threads, one error ends up getting logged multiple times as the polling runs on every logical thread. Therefore, introduce locking on the Intel side of the polling routine to prevent such duplicate error records from appearing. Based on a patch by Aristeu Rozanski <aris@ruivo.org>. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@ruivo.org> Link: https://lore.kernel.org/r/20230515143225.GC4090740@cathedrallabs.org
2023-03-08x86/mce: Always inline old MCA stubsBorislav Petkov (AMD)1-5/+5
The stubs for the ancient MCA support (CONFIG_X86_ANCIENT_MCE) are normally optimized away on 64-bit builds. However, an allmodconfig one causes the compiler to add sanitizer calls gunk into them and they exist as constprop calls. Which objtool then complains about: vmlinux.o: warning: objtool: do_machine_check+0xad8: call to \ pentium_machine_check.constprop.0() leaves .noinstr.text section due to them missing noinstr. One could tag them "noinstr" but what should really happen is, they should be forcefully inlined so that all that gunk gets optimized away and the warning doesn't even have a chance to fire. Do so. No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230222191054.4701-1-bp@alien8.de
2022-12-29x86/mce: Add support for Extended Physical Address MCA changesSmita Koralahalli1-1/+30
Newer AMD CPUs support more physical address bits. That is, the MCA_ADDR registers on Scalable MCA systems contain the ErrorAddr in bits [56:0] instead of [55:0]. Hence, the existing LSB field from bits [61:56] in MCA_ADDR must be moved around to accommodate the larger ErrorAddr size. MCA_CONFIG[McaLsbInStatusSupported] indicates this change. If set, the LSB field will be found in MCA_STATUS rather than MCA_ADDR. Each logical CPU has unique MCA bank in hardware and is not shared with other logical CPUs. Additionally, on SMCA systems, each feature bit may be different for each bank within same logical CPU. Check for MCA_CONFIG[McaLsbInStatusSupported] for each MCA bank and for each CPU. Additionally, all MCA banks do not support maximum ErrorAddr bits in MCA_ADDR. Some banks might support fewer bits but the remaining bits are marked as reserved. [ Yazen: Rebased and fixed up formatting. bp: Massage comments. ] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20221206173607.1185907-5-yazen.ghannam@amd.com
2022-12-29x86/mce: Define a function to extract ErrorAddr from MCA_ADDRSmita Koralahalli1-0/+15
Move MCA_ADDR[ErrorAddr] extraction into a separate helper function. This will be further refactored to support extended ErrorAddr bits in MCA_ADDR in newer AMD CPUs. [ bp: Massage. ] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/all/20220225193342.215780-3-Smita.KoralahalliChannabasappa@amd.com/
2022-06-28x86/mce: Check whether writes to MCA_STATUS are getting ignoredSmita Koralahalli1-1/+1
The platform can sometimes - depending on its settings - cause writes to MCA_STATUS MSRs to get ignored, regardless of HWCR[McStatusWrEn]'s value. For further info see PPR for AMD Family 19h, Model 01h, Revision B1 Processors, doc ID 55898 at https://bugzilla.kernel.org/show_bug.cgi?id=206537. Therefore, probe for ignored writes to MCA_STATUS to determine if hardware error injection is at all possible. [ bp: Heavily massage commit message and patch. ] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220214233640.70510-2-Smita.KoralahalliChannabasappa@amd.com
2022-02-23x86/mce: Remove the tolerance level controlBorislav Petkov1-2/+1
This is pretty much unused and not really useful. What is more, all relevant MCA hardware has recoverable machine checks support so there's no real need to tweak MCA tolerance levels in order to *maybe* extend machine lifetime. So rip it out. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/YcDq8PxvKtTENl/e@zn.tnic
2022-02-19x86/mce: Work around an erratum on fast string copy instructionsJue Wang1-1/+4
A rare kernel panic scenario can happen when the following conditions are met due to an erratum on fast string copy instructions: 1) An uncorrected error. 2) That error must be in first cache line of a page. 3) Kernel must execute page_copy from the page immediately before that page. The fast string copy instructions ("REP; MOVS*") could consume an uncorrectable memory error in the cache line _right after_ the desired region to copy and raise an MCE. Bit 0 of MSR_IA32_MISC_ENABLE can be cleared to disable fast string copy and will avoid such spurious machine checks. However, that is less preferable due to the permanent performance impact. Considering memory poison is rare, it's desirable to keep fast string copy enabled until an MCE is seen. Intel has confirmed the following: 1. The CPU erratum of fast string copy only applies to Skylake, Cascade Lake and Cooper Lake generations. Directly return from the MCE handler: 2. Will result in complete execution of the "REP; MOVS*" with no data loss or corruption. 3. Will not result in another MCE firing on the next poisoned cache line due to "REP; MOVS*". 4. Will resume execution from a correct point in code. 5. Will result in the same instruction that triggered the MCE firing a second MCE immediately for any other software recoverable data fetch errors. 6. Is not safe without disabling the fast string copy, as the next fast string copy of the same buffer on the same CPU would result in a PANIC MCE. This should mitigate the erratum completely with the only caveat that the fast string copy is disabled on the affected hyper thread thus performance degradation. This is still better than the OS crashing on MCEs raised on an irrelevant process due to "REP; MOVS*' accesses in a kernel context, e.g., copy_page. Tested: Injected errors on 1st cache line of 8 anonymous pages of process 'proc1' and observed MCE consumption from 'proc2' with no panic (directly returned). Without the fix, the host panicked within a few minutes on a random 'proc2' process due to kernel access from copy_page. [ bp: Fix comment style + touch ups, zap an unlikely(), improve the quirk function's readability. ] Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20220218013209.2436006-1-juew@google.com
2022-02-14x86/mce: Use arch atomic and bit helpersBorislav Petkov1-2/+21
The arch helpers do not have explicit KASAN instrumentation. Use them in noinstr code. Inline a couple more functions with single call sites, while at it: mce_severity_amd_smca() has a single call-site which is noinstr so force the inlining and fix: vmlinux.o: warning: objtool: mce_severity_amd.constprop.0()+0xca: call to \ mce_severity_amd_smca() leaves .noinstr.text section Always inline mca_msr_reg(): text data bss dec hex filename 16065240 128031326 36405368 180501934 ac23dae vmlinux.before 16065240 128031294 36405368 180501902 ac23d8e vmlinux.after and mce_no_way_out() as the latter one is used only once, to fix: vmlinux.o: warning: objtool: mce_read_aux()+0x53: call to mca_msr_reg() leaves .noinstr.text section vmlinux.o: warning: objtool: do_machine_check()+0xc9: call to mce_no_way_out() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/20220204083015.17317-4-bp@alien8.de
2021-12-13x86/mce: Use mce_rdmsrl() in severity checking codeBorislav Petkov1-0/+2
MCA has its own special MSR accessors. Use them. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208111343.8130-4-bp@alien8.de
2021-11-02Merge tag 'ras_core_for_v5.16_rc1' of ↵Linus Torvalds1-19/+40
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Get rid of a bunch of function pointers used in MCA land in favor of normal functions. This is in preparation of making the MCA code noinstr-aware - When the kernel copies data from user addresses and it encounters a machine check, a SIGBUS is sent to that process. Change this action to either an -EFAULT which is returned to the user or a short write, making the recovery action a lot more user-friendly * tag 'ras_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Sort mca_config members to get rid of unnecessary padding x86/mce: Get rid of the ->quirk_no_way_out() indirect call x86/mce: Get rid of msr_ops x86/mce: Get rid of machine_check_vector x86/mce: Get rid of the mce_severity function pointer x86/mce: Drop copyin special case for #MC x86/mce: Change to not send SIGBUS error during copy from user
2021-09-23x86/mce: Sort mca_config members to get rid of unnecessary paddingBorislav Petkov1-6/+6
$ pahole -C mca_config arch/x86/kernel/cpu/mce/core.o before: /* size: 40, cachelines: 1, members: 16 */ /* sum members: 21, holes: 1, sum holes: 3 */ /* sum bitfield members: 64 bits, bit holes: 2, sum bit holes: 32 bits */ /* padding: 4 */ /* last cacheline: 40 bytes */ after: /* size: 32, cachelines: 1, members: 16 */ /* padding: 3 */ /* last cacheline: 32 bytes */ No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-6-bp@alien8.de
2021-09-23x86/mce: Get rid of the ->quirk_no_way_out() indirect callBorislav Petkov1-1/+4
Use a flag setting to call the only quirk function for that. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-5-bp@alien8.de
2021-09-23x86/mce: Get rid of msr_opsBorislav Petkov1-6/+6
Avoid having indirect calls and use a normal function which returns the proper MSR address based on ->smca setting. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-4-bp@alien8.de
2021-09-23x86/mce: Get rid of machine_check_vectorBorislav Petkov1-5/+24
Get rid of the indirect function pointer and use flags settings instead to steer execution. Now that it is not an indirect call any longer, drop the instrumentation annotation for objtool too. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-3-bp@alien8.de
2021-09-23x86/mce: Get rid of the mce_severity function pointerBorislav Petkov1-2/+1
Turn it into a normal function which calls an AMD- or Intel-specific variant depending on the CPU it runs on. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-2-bp@alien8.de
2021-09-13x86/extable: Rework the exception table mechanicsThomas Gleixner1-10/+0
The exception table entries contain the instruction address, the fixup address and the handler address. All addresses are relative. Storing the handler address has a few downsides: 1) Most handlers need to be exported 2) Handlers can be defined everywhere and there is no overview about the handler types 3) MCE needs to check the handler type to decide whether an in kernel #MC can be recovered. The functionality of the handler itself is not in any way special, but for these checks there need to be separate functions which in the worst case have to be exported. Some of these 'recoverable' exception fixups are pretty obscure and just reuse some other handler to spare code. That obfuscates e.g. the #MC safe copy functions. Cleaning that up would require more handlers and exports Rework the exception fixup mechanics by storing a fixup type number instead of the handler address and invoke the proper handler for each fixup type. Also teach the extable sort to leave the type field alone. This makes most handlers static except for special cases like the MCE MSR fixup and the BPF fixup. This allows to add more types for cleaning up the obscure places without adding more handler code and exports. There is a marginal code size reduction for a production config and it removes _eight_ exported symbols. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lkml.kernel.org/r/20210908132525.211958725@linutronix.de
2021-09-13x86/mce: Get rid of stray semicolonsThomas Gleixner1-2/+2
and the random number of tabs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210908132525.154428878@linutronix.de
2020-10-07x86/mce: Pass pointer to saved pt_regs to severity calculation routinesYouquan Song1-1/+2
New recovery features require additional information about processor state when a machine check occurred. Pass pt_regs down to the routines that need it. No functional change. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-2-tony.luck@intel.com
2020-09-11x86/mce: Make mce_rdmsrl() panic on an inaccessible MSRBorislav Petkov1-0/+10
If an exception needs to be handled while reading an MSR - which is in most of the cases caused by a #GP on a non-existent MSR - then this is most likely the incarnation of a BIOS or a hardware bug. Such bug violates the architectural guarantee that MCA banks are present with all MSRs belonging to them. The proper fix belongs in the hardware/firmware - not in the kernel. Handling an #MC exception which is raised while an NMI is being handled would cause the nasty NMI nesting issue because of the shortcoming of IRET of reenabling NMIs when executed. And the machine is in an #MC context already so <Deity> be at its side. Tracing MSR accesses while in #MC is another no-no due to tracing being inherently a bad idea in atomic context: vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section so remove all that "additional" functionality from mce_rdmsrl() and provide it with a special exception handler which panics the machine when that MSR is not accessible. The exception handler prints a human-readable message explaining what the panic reason is but, what is more, it panics while in the #GP handler and latter won't have executed an IRET, thus opening the NMI nesting issue in the case when the #MC has happened while handling an NMI. (#MC itself won't be reenabled until MCG_STATUS hasn't been cleared). Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> [ Add missing prototypes for ex_handler_* ] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200906212130.GA28456@zn.tnic
2020-06-11Merge branch 'x86/entry' into ras/coreThomas Gleixner1-1/+1
to fixup conflicts in arch/x86/kernel/cpu/mce/core.c so MCE specific follow up patches can be applied without creating a horrible merge conflict afterwards.
2020-06-11x86/entry: Convert Machine Check to IDTENTRY_ISTThomas Gleixner1-1/+1
Convert #MC to IDTENTRY_MCE: - Implement the C entry points with DEFINE_IDTENTRY_MCE - Emit the ASM stub with DECLARE_IDTENTRY_MCE - Remove the ASM idtentry in 64bit - Remove the open coded ASM entry code in 32bit - Fixup the XEN/PV code - Remove the old prototypes - Remove the error code from *machine_check_vector() as it is always 0 and not used by any of the functions it can point to. Fixup all the functions as well. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200505135314.334980426@linutronix.de
2020-04-14x86/mce: Add mce=print_all optionTony Luck1-0/+1
Sometimes, when logs are getting lost, it's nice to just have everything dumped to the serial console. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200214222720.13168-7-tony.luck@intel.com
2020-04-14x86/mce/amd: Init thresholding machinery only on relevant vendorsThomas Gleixner1-3/+6
... and not unconditionally. [ bp: Add a new vendor_flags bit for that. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200403161943.1458-3-bp@alien8.de
2020-03-31Merge tag 'x86-entry-2020-03-30' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry code updates from Thomas Gleixner: - Convert the 32bit syscalls to be pt_regs based which removes the requirement to push all 6 potential arguments onto the stack and consolidates the interface with the 64bit variant - The first small portion of the exception and syscall related entry code consolidation which aims to address the recently discovered issues vs. RCU, int3, NMI and some other exceptions which can interrupt any context. The bulk of the changes is still work in progress and aimed for 5.8. - A few lockdep namespace cleanups which have been applied into this branch to keep the prerequisites for the ongoing work confined. * tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits) x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}() lockdep: Rename trace_softirqs_{on,off}() lockdep: Rename trace_hardirq_{enter,exit}() x86/entry: Rename ___preempt_schedule x86: Remove unneeded includes x86/entry: Drop asmlinkage from syscalls x86/entry/32: Enable pt_regs based syscalls x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments x86/entry/32: Rename 32-bit specific syscalls x86/entry/32: Clean up syscall_32.tbl x86/entry: Remove ABI prefixes from functions in syscall tables x86/entry/64: Add __SYSCALL_COMMON() x86/entry: Remove syscall qualifier support x86/entry/64: Remove ptregs qualifier from syscall table x86/entry: Move max syscall number calculation to syscallhdr.sh x86/entry/64: Split X32 syscall table into its own file x86/entry/64: Move sys_ni_syscall stub to common.c x86/entry/64: Use syscall wrappers for x32_rt_sigreturn x86/entry: Refactor SYS_NI macros ...
2020-02-27x86/entry/32: Force MCE through do_mce()Thomas Gleixner1-0/+3
Remove the pointless difference between 32 and 64 bit to make further unifications simpler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200225220216.428188397@linutronix.de
2020-02-19x86/mce: Do not log spurious corrected mce errorsPrarit Bhargava1-0/+2
A user has reported that they are seeing spurious corrected errors on their hardware. Intel Errata HSD131, HSM142, HSW131, and BDM48 report that "spurious corrected errors may be logged in the IA32_MC0_STATUS register with the valid field (bit 63) set, the uncorrected error field (bit 61) not set, a Model Specific Error Code (bits [31:16]) of 0x000F, and an MCA Error Code (bits [15:0]) of 0x0005." The Errata PDFs are linked in the bugzilla below. Block these spurious errors from the console and logs. [ bp: Move the intel_filter_mce() header declarations into the already existing CONFIG_X86_MCE_INTEL ifdeffery. ] Co-developed-by: Alexander Krupp <centos@akr.yagii.de> Signed-off-by: Alexander Krupp <centos@akr.yagii.de> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206587 Link: https://lkml.kernel.org/r/20200219131611.36816-1-prarit@redhat.com
2019-12-17x86/mce: Remove mce_inject_log() in favor of mce_log()Jan H. Schönherr1-2/+0
The mutex in mce_inject_log() became unnecessary with commit 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver"), though the original reason for its presence only vanished with commit 7298f08ea887 ("x86/mcelog: Get rid of RCU remnants"). Drop the mutex. And as that makes mce_inject_log() identical to mce_log(), get rid of the former in favor of the latter. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191210000733.17979-7-jschoenh@amazon.de
2019-10-01x86/mce: Add Zhaoxin LMCE supportTony W Wang-oc1-0/+4
Newer Zhaoxin CPUs support LMCE compatible with Intel. Add support for that. [ bp: Export functions and massage. ] Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: CooperYan@zhaoxin.com Cc: DavidWang@zhaoxin.com Cc: HerryYang@zhaoxin.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: QiyuanWang@zhaoxin.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/1568787573-1297-5-git-send-email-TonyWWang-oc@zhaoxin.com
2019-10-01x86/mce: Add Zhaoxin CMCI supportTony W Wang-oc1-0/+2
All newer Zhaoxin CPUs support CMCI and are compatible with Intel's Machine-Check Architecture. Add that support for Zhaoxin CPUs. [ bp: Massage comments and export intel_init_cmci(). ] Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: CooperYan@zhaoxin.com Cc: DavidWang@zhaoxin.com Cc: HerryYang@zhaoxin.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: QiyuanWang@zhaoxin.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/1568787573-1297-4-git-send-email-TonyWWang-oc@zhaoxin.com
2019-06-11x86/MCE: Make the number of MCA banks a per-CPU variableYazen Ghannam1-1/+1
The number of MCA banks is provided per logical CPU. Historically, this number has been the same across all CPUs, but this is not an architectural guarantee. Future AMD systems may have MCA bank counts that vary between logical CPUs in a system. This issue was partially addressed in 006c077041dc ("x86/mce: Handle varying MCA bank counts") by allocating structures using the maximum number of MCA banks and by saving the maximum MCA bank count in a system as the global count. This means that some extra structures are allocated. Also, this means that CPUs will spend more time in the #MC and other handlers checking extra MCA banks. Thus, define the number of MCA banks as a per-CPU variable. [ bp: Make mce_num_banks an unsigned int. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: "x86@kernel.org" <x86@kernel.org> Link: https://lkml.kernel.org/r/20190607201752.221446-5-Yazen.Ghannam@amd.com
2019-06-11x86/MCE: Make struct mce_banks[] staticYazen Ghannam1-10/+0
The struct mce_banks[] array is only used in mce/core.c so move its definition there and make it static. Also, change the "init" field to bool type. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: "x86@kernel.org" <x86@kernel.org> Link: https://lkml.kernel.org/r/20190607201752.221446-2-Yazen.Ghannam@amd.com
2019-04-23x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h modelsYazen Ghannam1-0/+6
AMD family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and can safely be ignored. However, the high error rate may cause the MCA threshold counter to overflow causing a high rate of thresholding interrupts. In addition, users may see the errors reported through the AMD MCE decoder module, even with the interrupt disabled, due to MCA polling. Clear the "Counter Present" bit in the Instruction Fetch bank's MCA_MISC0 register. This will prevent enabling MCA thresholding on this bank which will prevent the high interrupt rate due to this error. Define an AMD-specific function to filter these errors from the MCE event pool so that they don't get reported during early boot. Rename filter function in EDAC/mce_amd to avoid a naming conflict, while at it. [ bp: Move function prototype to the internal header and massage/cleanup, fix typos. ] Reported-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "clemej@gmail.com" <clemej@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Shirish S <Shirish.S@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Cc: <stable@vger.kernel.org> # 5.0.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models Cc: <stable@vger.kernel.org> # 5.0.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk Cc: <stable@vger.kernel.org> # 5.0.x: 9308fd407455: x86/MCE: Group AMD function prototypes in <asm/mce.h> Cc: <stable@vger.kernel.org> # 5.0.x Link: https://lkml.kernel.org/r/20190325163410.171021-2-Yazen.Ghannam@amd.com
2019-04-23x86/MCE: Add an MCE-record filtering functionYazen Ghannam1-0/+3
Some systems may report spurious MCA errors. In general, spurious MCA errors may be disabled by clearing a particular bit in MCA_CTL. However, clearing a bit in MCA_CTL may not be recommended for some errors, so the only option is to ignore them. An MCA error is printed and handled after it has been added to the MCE event pool. So an MCA error can be ignored by not adding it to that pool in the first place. Add such a filtering function. [ bp: Move function prototype to the internal header and massage. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "clemej@gmail.com" <clemej@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Pu Wen <puwen@hygon.cn> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: "rafal@milecki.pl" <rafal@milecki.pl> Cc: Shirish S <Shirish.S@amd.com> Cc: <stable@vger.kernel.org> # 5.0.x Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190325163410.171021-1-Yazen.Ghannam@amd.com
2018-12-06x86/mce: Unify pr_* prefixBorislav Petkov1-0/+3
Move the pr_fmt prefix to internal.h and include it everywhere. This way, all pr_* printed strings will be prepended with "mce: ". No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20181205200913.GR29510@zn.tnic
2018-12-05x86/mce: Streamline MCE subsystem's namingBorislav Petkov1-0/+173
Rename the containing folder to "mce" which is the most widespread name. Drop the "mce[-_]" filename prefix of some compilation units (while others don't have it). This unifies the file naming in the MCE subsystem: mce/ |-- amd.c |-- apei.c |-- core.c |-- dev-mcelog.c |-- genpool.c |-- inject.c |-- intel.c |-- internal.h |-- Makefile |-- p5.c |-- severity.c |-- therm_throt.c |-- threshold.c `-- winchip.c No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20181205141323.14995-1-bp@alien8.de