summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-06-12x86/pci: Skip early E820 check for ECAM regionBjorn Helgaas1-11/+29
commit 199f968f1484a14024d0d467211ffc2faf193eb4 upstream. Arul, Mateusz, Imcarneiro91, and Aman reported a regression caused by 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map"). On the Lenovo Legion 9i laptop, that commit removes the ECAM area from E820, which means the early E820 validation fails, which means we don't enable ECAM in the "early MCFG" path. The static MCFG table describes ECAM without depending on the ACPI interpreter. Many Legion 9i ACPI methods rely on that, so they fail when PCI config access isn't available, resulting in the embedded controller, PS/2, audio, trackpad, and battery devices not being detected. The _OSC method also fails, so Linux can't take control of the PCIe hotplug, PME, and AER features: # pci_mmcfg_early_init() PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0] PCI: not using ECAM ([mem 0xc0000000-0xce0fffff] not reserved) ACPI Error: AE_ERROR, Returned by Handler for [PCI_Config] (20230628/evregion-300) ACPI: Interpreter enabled ACPI: Ignoring error and continuing table load ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162) ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220) ACPI: Skipping parse of AML opcode: OpcodeName unavailable (0x0010) ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162) ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220) ... ACPI Error: Aborting method \_SB.PC00._OSC due to previous error (AE_NOT_FOUND) (20230628/psparse-529) acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_NOT_FOUND) # pci_mmcfg_late_init() PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0] PCI: [Firmware Info]: ECAM [mem 0xc0000000-0xce0fffff] not reserved in ACPI motherboard resources PCI: ECAM [mem 0xc0000000-0xce0fffff] is EfiMemoryMappedIO; assuming valid PCI: ECAM [mem 0xc0000000-0xce0fffff] reserved to work around lack of ACPI motherboard _CRS Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be reserved by a PNP0C02 resource, but there's no requirement to mention it in E820, so we shouldn't look at E820 to validate the ECAM space described by MCFG. In 2006, 946f2ee5c731 ("[PATCH] i386/x86-64: Check that MCFG points to an e820 reserved area") added a sanity check of E820 to work around buggy MCFG tables, but that over-aggressive validation causes failures like this one. Keep the E820 validation check for machines older than 2016, an arbitrary ten years after 946f2ee5c731, so machines that depend on it don't break. Skip the early E820 check for 2016 and newer BIOSes since there's no requirement to describe ECAM in E820. Link: https://lore.kernel.org/r/20240417204012.215030-2-helgaas@kernel.org Fixes: 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map") Reported-by: Mateusz Kaduk <mateusz.kaduk@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218444 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Mateusz Kaduk <mateusz.kaduk@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12x86/topology: Handle bogus ACPI tables correctlyThomas Gleixner1-3/+50
commit 9d22c96316ac59ed38e80920c698fed38717b91b upstream. The ACPI specification clearly states how the processors should be enumerated in the MADT: "To ensure that the boot processor is supported post initialization, two guidelines should be followed. The first is that OSPM should initialize processors in the order that they appear in the MADT. The second is that platform firmware should list the boot processor as the first processor entry in the MADT. ... Failure of OSPM implementations and platform firmware to abide by these guidelines can result in both unpredictable and non optimal platform operation." The kernel relies on that ordering to detect the real BSP on crash kernels which is important to avoid sending a INIT IPI to it as that would cause a full machine reset. On a Dell XPS 16 9640 the BIOS ignores this rule and enumerates the CPUs in the wrong order. As a consequence the kernel falsely detects a crash kernel and disables the corresponding CPU. Prevent this by checking the IA32_APICBASE MSR for the BSP bit on the boot CPU. If that bit is set, then the MADT based BSP detection can be safely ignored. If the kernel detects a mismatch between the BSP bit and the first enumerated MADT entry then emit a firmware bug message. This obviously also has to be taken into account when the boot APIC ID and the first enumerated APIC ID match. If the boot CPU does not have the BSP bit set in the APICBASE MSR then there is no way for the boot CPU to determine which of the CPUs is the real BSP. Sending an INIT to the real BSP would reset the machine so the only sane way to deal with that is to limit the number of CPUs to one and emit a corresponding warning message. Fixes: 5c5682b9f87a ("x86/cpu: Detect real BSP on crash kernels") Reported-by: Carsten Tolkmit <ctolkmit@ennit.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Carsten Tolkmit <ctolkmit@ennit.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87le48jycb.ffs@tglx Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218837 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12efi: libstub: only free priv.runtime_map when allocatedHagar Hemdan1-2/+2
commit 4b2543f7e1e6b91cfc8dd1696e3cdf01c3ac8974 upstream. priv.runtime_map is only allocated when efi_novamap is not set. Otherwise, it is an uninitialized value. In the error path, it is freed unconditionally. Avoid passing an uninitialized value to free_pool. Free priv.runtime_map only when it was allocated. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Fixes: f80d26043af9 ("efi: libstub: avoid efi_get_memory_map() for allocating the virt map") Cc: <stable@vger.kernel.org> Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12x86/efistub: Omit physical KASLR when memory reservations existArd Biesheuvel1-2/+26
commit 15aa8fb852f995dd234a57f12dfb989044968bb6 upstream. The legacy decompressor has elaborate logic to ensure that the randomized physical placement of the decompressed kernel image does not conflict with any memory reservations, including ones specified on the command line using mem=, memmap=, efi_fake_mem= or hugepages=, which are taken into account by the kernel proper at a later stage. When booting in EFI mode, it is the firmware's job to ensure that the chosen range does not conflict with any memory reservations that it knows about, and this is trivially achieved by using the firmware's memory allocation APIs. That leaves reservations specified on the command line, though, which the firmware knows nothing about, as these regions have no other special significance to the platform. Since commit a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") these reservations are not taken into account when randomizing the physical placement, which may result in conflicts where the memory cannot be reserved by the kernel proper because its own executable image resides there. To avoid having to duplicate or reuse the existing complicated logic, disable physical KASLR entirely when such overrides are specified. These are mostly diagnostic tools or niche features, and physical KASLR (as opposed to virtual KASLR, which is much more important as it affects the memory addresses observed by code executing in the kernel) is something we can live without. Closes: https://lkml.kernel.org/r/FA5F6719-8824-4B04-803E-82990E65E627%40akamai.com Reported-by: Ben Chaney <bchaney@akamai.com> Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Cc: <stable@vger.kernel.org> # v6.1+ Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12Revert "drm: Make drivers depends on DRM_DW_HDMI"Geert Uytterhoeven7-10/+9
commit 8f7f115596d3dccedc06f5813e0269734f5cc534 upstream. This reverts commit c0e0f139354c01e0213204e4a96e7076e5a3e396, as helper code should always be selected by the driver that needs it, for the convenience of the final user configuring a kernel. The user who configures a kernel should not need to know which helpers are needed for the driver he is interested in. Making a driver depend on helper code means that the user needs to know which helpers to enable first, which is very user-unfriendly. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/bd93d43b07f8ed6368119f4a5ddac2ee80debe53.1713780345.git.geert+renesas@glider.be Signed-off-by: Maxime Ripard <mripard@kernel.org> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12ALSA: seq: ump: Fix swapped song position pointer dataTakashi Iwai1-3/+3
[ Upstream commit 310fa3ec2859f1c094e6e9b5d2e1ca51738c409a ] At converting between the legacy event and UMP, the parameters for MIDI Song Position Pointer are incorrectly stored. It should have been LSB -> MSB order while it stored in MSB -> LSB order. This patch corrects the ordering. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240531075110.3250-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12riscv: prevent pt_regs corruption for secondary idle threadsSergey Matyukevich2-3/+2
[ Upstream commit a638b0461b58aa3205cd9d5f14d6f703d795b4af ] Top of the kernel thread stack should be reserved for pt_regs. However this is not the case for the idle threads of the secondary boot harts. Their stacks overlap with their pt_regs, so both may get corrupted. Similar issue has been fixed for the primary hart, see c7cdd96eca28 ("riscv: prevent stack corruption by reserving task_pt_regs(p) early"). However that fix was not propagated to the secondary harts. The problem has been noticed in some CPU hotplug tests with V enabled. The function smp_callin stored several registers on stack, corrupting top of pt_regs structure including status field. As a result, kernel attempted to save or restore inexistent V context. Fixes: 9a2451f18663 ("RISC-V: Avoid using per cpu array for ordered booting") Fixes: 2875fe056156 ("RISC-V: Add cpu_ops and modify default booting method") Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240523084327.2013211-1-geomatsi@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12hwmon: (shtc1) Fix property misspellingGuenter Roeck1-1/+1
[ Upstream commit 52a2c70c3ec555e670a34dd1ab958986451d2dd2 ] The property name is "sensirion,low-precision", not "sensicon,low-precision". Cc: Chris Ruehl <chris.ruehl@gtsys.com.hk> Fixes: be7373b60df5 ("hwmon: shtc1: add support for device tree bindings") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12hwmon: (intel-m10-bmc-hwmon) Fix multiplier for N6000 board power sensorPeter Colberg1-1/+1
[ Upstream commit 027a44fedd55fbdf1d45603894634acd960ad04b ] The Intel N6000 BMC outputs the board power value in milliwatt, whereas the hwmon sysfs interface must provide power values in microwatt. Fixes: e1983220ae14 ("hwmon: intel-m10-bmc-hwmon: Add N6000 sensors") Signed-off-by: Peter Colberg <peter.colberg@intel.com> Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Link: https://lore.kernel.org/r/20240521181246.683833-1-peter.colberg@intel.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/panel: sitronix-st7789v: fix display size for jt240mhqs_hwt_ek_e3 panelGerald Loacker1-2/+2
[ Upstream commit b62c150c3bae72ac1910dcc588f360159eb0744a ] This is a portrait mode display. Change the dimensions accordingly. Fixes: 0fbbe96bfa08 ("drm/panel: sitronix-st7789v: add jasonic jt240mhqs-hwt-ek-e3 support") Signed-off-by: Gerald Loacker <gerald.loacker@wolfvision.net> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-3-e4821802443d@wolfvision.net Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-3-e4821802443d@wolfvision.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/panel: sitronix-st7789v: tweak timing for jt240mhqs_hwt_ek_e3 panelGerald Loacker1-3/+3
[ Upstream commit 2ba50582634d0bfe3a333ab7575a7f0122a7cde8 ] Use the default timing parameters to get a refresh rate of about 60 Hz for a clock of 6 MHz. Fixes: 0fbbe96bfa08 ("drm/panel: sitronix-st7789v: add jasonic jt240mhqs-hwt-ek-e3 support") Signed-off-by: Gerald Loacker <gerald.loacker@wolfvision.net> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-2-e4821802443d@wolfvision.net Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-2-e4821802443d@wolfvision.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/panel: sitronix-st7789v: fix timing for jt240mhqs_hwt_ek_e3 panelGerald Loacker1-3/+3
[ Upstream commit 0e5895ff7fab0fc05ec17daf9a568368828fa6ea ] Flickering was observed when using partial mode. Moving the vsync to the same position as used by the default sitronix-st7789v timing resolves this issue. Fixes: 0fbbe96bfa08 ("drm/panel: sitronix-st7789v: add jasonic jt240mhqs-hwt-ek-e3 support") Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Signed-off-by: Gerald Loacker <gerald.loacker@wolfvision.net> Link: https://lore.kernel.org/r/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-1-e4821802443d@wolfvision.net Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-1-e4821802443d@wolfvision.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12powerpc/pseries/lparcfg: drop error message from guest name lookupNathan Lynch1-2/+2
[ Upstream commit 12870ae3818e39ea65bf710f645972277b634f72 ] It's not an error or exceptional situation when the hosting environment does not expose a name for the LP/guest via RTAS or the device tree. This happens with qemu when run without the '-name' option. The message also lacks a newline. Remove it. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: eddaa9a40275 ("powerpc/pseries: read the lpar name from the firmware") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240524-lparcfg-updates-v2-1-62e2e9d28724@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ALSA: seq: Fix yet another spot for system message conversionTakashi Iwai1-0/+1
[ Upstream commit 700fe6fd093d08c6da2bda8efe00479b0e617327 ] We fixed the incorrect UMP type for system messages in the recent commit, but it missed one place in system_ev_to_ump_midi1(). Fix it now. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Fixes: c2bb79613fed ("ALSA: seq: Fix incorrect UMP type for system messages") Link: https://lore.kernel.org/r/20240530101044.17524-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outboundYue Haibing1-2/+2
[ Upstream commit b3dc6e8003b500861fa307e9a3400c52e78e4d3a ] Raw packet from PF_PACKET socket ontop of an IPv6-backed ipvlan device will hit WARN_ON_ONCE() in sk_mc_loop() through sch_direct_xmit() path. WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 sk_mc_loop+0x2d/0x70 Modules linked in: sch_netem ipvlan rfkill cirrus drm_shmem_helper sg drm_kms_helper CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ #279 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:sk_mc_loop+0x2d/0x70 Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212 RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001 RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000 RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00 R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000 R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000 FS: 0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? __warn (kernel/panic.c:693) ? sk_mc_loop (net/core/sock.c:760) ? report_bug (lib/bug.c:201 lib/bug.c:219) ? handle_bug (arch/x86/kernel/traps.c:239) ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) ? sk_mc_loop (net/core/sock.c:760) ip6_finish_output2 (net/ipv6/ip6_output.c:83 (discriminator 1)) ? nf_hook_slow (net/netfilter/core.c:626) ip6_finish_output (net/ipv6/ip6_output.c:222) ? __pfx_ip6_finish_output (net/ipv6/ip6_output.c:215) ipvlan_xmit_mode_l3 (drivers/net/ipvlan/ipvlan_core.c:602) ipvlan ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:226) ipvlan dev_hard_start_xmit (net/core/dev.c:3594) sch_direct_xmit (net/sched/sch_generic.c:343) __qdisc_run (net/sched/sch_generic.c:416) net_tx_action (net/core/dev.c:5286) handle_softirqs (kernel/softirq.c:555) __irq_exit_rcu (kernel/softirq.c:589) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043) The warning triggers as this: packet_sendmsg packet_snd //skb->sk is packet sk __dev_queue_xmit __dev_xmit_skb //q->enqueue is not NULL __qdisc_run sch_direct_xmit dev_hard_start_xmit ipvlan_start_xmit ipvlan_xmit_mode_l3 //l3 mode ipvlan_process_outbound //vepa flag ipvlan_process_v6_outbound ip6_local_out __ip6_finish_output ip6_finish_output2 //multicast packet sk_mc_loop //sk->sk_family is AF_PACKET Call ip{6}_local_out() with NULL sk in ipvlan as other tunnels to fix this. Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240529095633.613103-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net: ena: Fix redundant device NUMA node overrideShay Agroskin1-11/+0
[ Upstream commit 2dc8b1e7177d4f49f492ce648440caf2de0c3616 ] The driver overrides the NUMA node id of the device regardless of whether it knows its correct value (often setting it to -1 even though the node id is advertised in 'struct device'). This can lead to suboptimal configurations. This patch fixes this behavior and makes the shared memory allocation functions use the NUMA node id advertised by the underlying device. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20240528170912.1204417-1-shayagr@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ice: fix 200G PHY types to link speed mappingPaul Greenwalt1-0/+10
[ Upstream commit 2a6d8f2de2224ac46df94dc40f43f8b9701f6703 ] Commit 24407a01e57c ("ice: Add 200G speed/phy type use") added support for 200G PHY speeds, but did not include the mapping of 200G PHY types to link speed. As a result the driver is returning UNKNOWN link speed when setting 200G ethtool advertised link modes. To fix this add 200G PHY types to link speed mapping to ice_get_link_speed_based_on_phy_type(). Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-5-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net: dsa: microchip: fix RGMII error in KSZ DSA driverTristram Ha1-1/+1
[ Upstream commit 278d65ccdadb5f0fa0ceaf7b9cc97b305cd72822 ] The driver should return RMII interface when XMII is running in RMII mode. Fixes: 0ab7f6bf1675 ("net: dsa: microchip: ksz9477: use common xmii function") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Acked-by: Jerry Ray <jerry.ray@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/1716932066-3342-1-git-send-email-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ipv4: correctly iterate over the target netns in inet_dump_ifaddr()Alexander Mikhalitsyn1-1/+1
[ Upstream commit b8c8abefc07b47f0dc9342530b7618237df96724 ] A recent change to inet_dump_ifaddr had the function incorrectly iterate over net rather than tgt_net, resulting in the data coming for the incorrect network namespace. Fixes: cdb2f80f1c10 ("inet: use xa_array iterator to implement inet_dump_ifaddr()") Reported-by: Stéphane Graber <stgraber@stgraber.org> Closes: https://github.com/lxc/incus/issues/892 Bisected-by: Stéphane Graber <stgraber@stgraber.org> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Tested-by: Stéphane Graber <stgraber@stgraber.org> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240528203030.10839-1-aleksandr.mikhalitsyn@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net: fix __dst_negative_advice() raceEric Dumazet5-47/+30
[ Upstream commit 92f1655aa2b2294d0b49925f3b875a634bd3b59e ] __dst_negative_advice() does not enforce proper RCU rules when sk->dst_cache must be cleared, leading to possible UAF. RCU rules are that we must first clear sk->sk_dst_cache, then call dst_release(old_dst). Note that sk_dst_reset(sk) is implementing this protocol correctly, while __dst_negative_advice() uses the wrong order. Given that ip6_negative_advice() has special logic against RTF_CACHE, this means each of the three ->negative_advice() existing methods must perform the sk_dst_reset() themselves. Note the check against NULL dst is centralized in __dst_negative_advice(), there is no need to duplicate it in various callbacks. Many thanks to Clement Lecigne for tracking this issue. This old bug became visible after the blamed commit, using UDP sockets. Fixes: a87cb3e48ee8 ("net: Facility to report route quality of connected sockets") Reported-by: Clement Lecigne <clecigne@google.com> Diagnosed-by: Clement Lecigne <clecigne@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12inet: introduce dst_rtable() helperEric Dumazet23-70/+64
[ Upstream commit 05d6d492097c55f2d153fc3fd33cbe78e1e28e0a ] I added dst_rt6_info() in commit e8dfd42c17fa ("ipv6: introduce dst_rt6_info() helper") This patch does a similar change for IPv4. Instead of (struct rtable *)dst casts, we can use : #define dst_rtable(_ptr) \ container_of_const(_ptr, struct rtable, dst) Patch is smaller than IPv6 one, because IPv4 has skb_rtable() helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20240429133009.1227754-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 92f1655aa2b2 ("net: fix __dst_negative_advice() race") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ipv6: introduce dst_rt6_info() helperEric Dumazet30-86/+77
[ Upstream commit e8dfd42c17faf183415323db1ef0c977be0d6489 ] Instead of (struct rt6_info *)dst casts, we can use : #define dst_rt6_info(_ptr) \ container_of_const(_ptr, struct rt6_info, dst) Some places needed missing const qualifiers : ip6_confirm_neigh(), ipv6_anycast_destination(), ipv6_unicast_destination(), has_gateway() v2: added missing parts (David Ahern) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 92f1655aa2b2 ("net: fix __dst_negative_advice() race") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/amdgpu: Adjust logic in amdgpu_device_partner_bandwidth()Alex Deucher1-7/+12
[ Upstream commit ba46b3bda296c4f82b061ac40b90f49d2a00a380 ] Use current speed/width on devices which don't support dynamic PCIe switching. Fixes: 466a7d115326 ("drm/amd: Use the first non-dGPU PCI device for BW limits") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3289 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12spi: stm32: Don't warn about spurious interruptsUwe Kleine-König1-1/+1
[ Upstream commit 95d7c452a26564ef0c427f2806761b857106d8c4 ] The dev_warn to notify about a spurious interrupt was introduced with the reasoning that these are unexpected. However spurious interrupts tend to trigger continously and the error message on the serial console prevents that the core's detection of spurious interrupts kicks in (which disables the irq) and just floods the console. Fixes: c64e7efe46b7 ("spi: stm32: make spurious and overrun interrupts visible") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://msgid.link/r/20240521105241.62400-2-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12kheaders: use `command -v` to test for existence of `cpio`Miguel Ojeda1-1/+6
[ Upstream commit 6e58e0173507e506a5627741358bc770f220e356 ] Commit 13e1df09284d ("kheaders: explicitly validate existence of cpio command") added an explicit check for `cpio` using `type`. However, `type` in `dash` (which is used in some popular distributions and base images as the shell script runner) prints the missing message to standard output, and thus no error is printed: $ bash -c 'type missing >/dev/null' bash: line 1: type: missing: not found $ dash -c 'type missing >/dev/null' $ For instance, this issue may be seen by loongarch builders, given its defconfig enables CONFIG_IKHEADERS since commit 9cc1df421f00 ("LoongArch: Update Loongson-3 default config file"). Therefore, use `command -v` instead to have consistent behavior, and take the chance to provide a more explicit error. Fixes: 13e1df09284d ("kheaders: explicitly validate existence of cpio command") Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/i915/gt: Fix CCS id's calculation for CCS mode settingAndi Shyti3-1/+15
[ Upstream commit ee01b6a386eaf9984b58a2476e8f531149679da9 ] The whole point of the previous fixes has been to change the CCS hardware configuration to generate only one stream available to the compute users. We did this by changing the info.engine_mask that is set during device probe, reset during the detection of the fused engines, and finally reset again when choosing the CCS mode. We can't use the engine_mask variable anymore, as with the current configuration, it imposes only one CCS no matter what the hardware configuration is. Before changing the engine_mask for the third time, save it and use it for calculating the CCS mode. After the previous changes, the user reported a performance drop to around 1/4. We have tested that the compute operations, with the current patch, have improved by the same factor. Fixes: 6db31251bb26 ("drm/i915/gt: Enable only one CCS for compute workload") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Gnattu OC <gnattuoc@me.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Tested-by: Jian Ye <jian.ye@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Tested-by: Gnattu OC <gnattuoc@me.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240517090616.242529-1-andi.shyti@linux.intel.com (cherry picked from commit a09d2327a9ba8e3f5be238bc1b7ca2809255b464) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/i915/guc: avoid FIELD_PREP warningArnd Bergmann1-3/+3
[ Upstream commit d4f36db62396b73bed383c0b6e48d36278cafa78 ] With gcc-7 and earlier, there are lots of warnings like In file included from <command-line>:0:0: In function '__guc_context_policy_add_priority.isra.66', inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3, inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2: include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ... drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP' FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) | \ ^~~~~~~~~~ Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning. Fixes: 77b6f79df66e ("drm/i915/guc: Update to GuC version 69.0.3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com (cherry picked from commit 364e039827ef628c650c21c1afe1c54d9c3296d9) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12kconfig: fix comparison to constant symbols, 'm', 'n'Masahiro Yamada1-2/+4
[ Upstream commit aabdc960a283ba78086b0bf66ee74326f49e218e ] Currently, comparisons to 'm' or 'n' result in incorrect output. [Test Code] config MODULES def_bool y modules config A def_tristate m config B def_bool A > n CONFIG_B is unset, while CONFIG_B=y is expected. The reason for the issue is because Kconfig compares the tristate values as strings. Currently, the .type fields in the constant symbol definitions, symbol_{yes,mod,no} are unspecified, i.e., S_UNKNOWN. When expr_calc_value() evaluates 'A > n', it checks the types of 'A' and 'n' to determine how to compare them. The left-hand side, 'A', is a tristate symbol with a value of 'm', which corresponds to a numeric value of 1. (Internally, 'y', 'm', and 'n' are represented as 2, 1, and 0, respectively.) The right-hand side, 'n', has an unknown type, so it is treated as the string "n" during the comparison. expr_calc_value() compares two values numerically only when both can have numeric values. Otherwise, they are compared as strings. symbol numeric value ASCII code ------------------------------------- y 2 0x79 m 1 0x6d n 0 0x6e 'm' is greater than 'n' if compared numerically (since 1 is greater than 0), but smaller than 'n' if compared as strings (since the ASCII code 0x6d is smaller than 0x6e). Specifying .type=S_TRISTATE for symbol_{yes,mod,no} fixes the above test code. Doing so, however, would cause a regression to the following test code. [Test Code 2] config MODULES def_bool n modules config A def_tristate n config B def_bool A = m You would get CONFIG_B=y, while CONFIG_B should not be set. The reason is because sym_get_string_value() turns 'm' into 'n' when the module feature is disabled. Consequently, expr_calc_value() evaluates 'A = n' instead of 'A = m'. This oddity has been hidden because the type of 'm' was previously S_UNKNOWN instead of S_TRISTATE. sym_get_string_value() should not tweak the string because the tristate value has already been correctly calculated. There is no reason to return the string "n" where its tristate value is mod. Fixes: 31847b67bec0 ("kconfig: allow use of relations other than (in)equality") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net/sched: taprio: extend minimum interval restriction to entire cycle tooVladimir Oltean2-5/+27
[ Upstream commit fb66df20a7201e60f2b13d7f95d031b31a8831d3 ] It is possible for syzbot to side-step the restriction imposed by the blamed commit in the Fixes: tag, because the taprio UAPI permits a cycle-time different from (and potentially shorter than) the sum of entry intervals. We need one more restriction, which is that the cycle time itself must be larger than N * ETH_ZLEN bit times, where N is the number of schedule entries. This restriction needs to apply regardless of whether the cycle time came from the user or was the implicit, auto-calculated value, so we move the existing "cycle == 0" check outside the "if "(!new->cycle_time)" branch. This way covers both conditions and scenarios. Add a selftest which illustrates the issue triggered by syzbot. Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals") Reported-by: syzbot+a7d2b1d5d1af83035567@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/0000000000007d66bc06196e7c66@google.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20240527153955.553333-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net/sched: taprio: make q->picos_per_byte available to fill_sched_entry()Vladimir Oltean2-1/+25
[ Upstream commit e634134180885574d1fe7aa162777ba41e7fcd5b ] In commit b5b73b26b3ca ("taprio: Fix allowing too small intervals"), a comparison of user input against length_to_duration(q, ETH_ZLEN) was introduced, to avoid RCU stalls due to frequent hrtimers. The implementation of length_to_duration() depends on q->picos_per_byte being set for the link speed. The blamed commit in the Fixes: tag has moved this too late, so the checks introduced above are ineffective. The q->picos_per_byte is zero at parse_taprio_schedule() -> parse_sched_list() -> parse_sched_entry() -> fill_sched_entry() time. Move the taprio_set_picos_per_byte() call as one of the first things in taprio_change(), before the bulk of the netlink attribute parsing is done. That's because it is needed there. Add a selftest to make sure the issue doesn't get reintroduced. Fixes: 09dbdf28f9f9 ("net/sched: taprio: fix calculation of maximum gate durations") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240527153955.553333-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12netfilter: nft_fib: allow from forward/input without iif selectorEric Garver1-5/+3
[ Upstream commit e8ded22ef0f4831279c363c264cd41cd9d59ca9e ] This removes the restriction of needing iif selector in the forward/input hooks for fib lookups when requested result is oif/oifname. Removing this restriction allows "loose" lookups from the forward hooks. Fixes: be8be04e5ddb ("netfilter: nft_fib: reverse path filter for policy-based routing on iif") Signed-off-by: Eric Garver <eric@garver.life> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12netfilter: tproxy: bail out if IP has been disabled on the deviceFlorian Westphal1-0/+2
[ Upstream commit 21a673bddc8fd4873c370caf9ae70ffc6d47e8d3 ] syzbot reports: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] [..] RIP: 0010:nf_tproxy_laddr4+0xb7/0x340 net/ipv4/netfilter/nf_tproxy_ipv4.c:62 Call Trace: nft_tproxy_eval_v4 net/netfilter/nft_tproxy.c:56 [inline] nft_tproxy_eval+0xa9a/0x1a00 net/netfilter/nft_tproxy.c:168 __in_dev_get_rcu() can return NULL, so check for this. Reported-and-tested-by: syzbot+b94a6818504ea90d7661@syzkaller.appspotmail.com Fixes: cc6eb4338569 ("tproxy: use the interface primary IP address as a default value for --on-ip") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12netfilter: nft_payload: skbuff vlan metadata mangle supportPablo Neira Ayuso1-7/+65
[ Upstream commit 33c563ebf8d3deed7d8addd20d77398ac737ef9a ] Userspace assumes vlan header is present at a given offset, but vlan offload allows to store this in metadata fields of the skbuff. Hence mangling vlan results in a garbled packet. Handle this transparently by adding a parser to the kernel. If vlan metadata is present and payload offset is over 12 bytes (source and destination mac address fields), then subtract vlan header present in vlan metadata, otherwise mangle vlan metadata based on offset and length, extracting data from the source register. This is similar to: 8cfd23e67401 ("netfilter: nft_payload: work around vlan header stripping") to deal with vlan payload mangling. Fixes: 7ec3f7b47b8d ("netfilter: nft_payload: add packet mangling support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net: ti: icssg-prueth: Fix start counter for ft1 filterMD Danish Anwar1-1/+1
[ Upstream commit 56a5cf538c3f2d935b0d81040a8303b6e7fc5fd8 ] The start counter for FT1 filter is wrongly set to 0 in the driver. FT1 is used for source address violation (SAV) check and source address starts at Byte 6 not Byte 0. Fix this by changing start counter to ETH_ALEN in icssg_ft1_set_mac_addr(). Fixes: e9b4ece7d74b ("net: ti: icssg-prueth: Add Firmware config and classification APIs.") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20240527063015.263748-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12block: stack max_user_sectorsChristoph Hellwig1-0/+2
[ Upstream commit e528bede6f4e6822afdf0fa80be46ea9199f0911 ] The max_user_sectors is one of the three factors determining the actual max_sectors limit for READ/WRITE requests. Because of that it needs to be stacked at least for the device mapper multi-path case where requests are directly inserted on the lower device. For SCSI disks this is important because the sd driver actually sets it's own advisory limit that is lower than max_hw_sectors based on the block limits VPD page. While this is a bit odd an unusual, the same effect can happen if a user or udev script tweaks the value manually. Fixes: 4f563a64732d ("block: add a max_user_discard_sectors queue limit") Reported-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240523182618.602003-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12sd: also set max_user_sectors when setting max_sectorsChristoph Hellwig1-1/+3
[ Upstream commit bafea1c58b24be594d97841ced1b7ae0347bf6e3 ] sd can set a max_sectors value that is lower than the max_hw_sectors limit based on the block limits VPD page. While this is rather unusual, it used to work until the max_user_sectors field was split out to cleanly deal with conflicting hardware and user limits when the hardware limit changes. Also set max_user_sectors to ensure the limit can properly be stacked. Fixes: 4f563a64732d ("block: add a max_user_discard_sectors queue limit") Reported-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240523182618.602003-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ALSA: seq: Don't clear bank selection at event -> UMP MIDI2 conversionTakashi Iwai1-1/+0
[ Upstream commit a200df7deb3186cd7b55abb77ab96dfefb8a4f09 ] The current code to convert from a legacy sequencer event to UMP MIDI2 clears the bank selection at each time the program change is submitted. This is confusing and may lead to incorrect bank values tranmitted to the destination in the end. Drop the line to clear the bank info and keep the provided values. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240527151852.29036-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ALSA: seq: Fix missing bank setup between MIDI1/MIDI2 UMP conversionTakashi Iwai1-0/+38
[ Upstream commit 8a42886cae307663f3f999846926bd6e64392000 ] When a UMP packet is converted between MIDI1 and MIDI2 protocols, the bank selection may be lost. The conversion from MIDI1 to MIDI2 needs the encoding of the bank into UMP_MSG_STATUS_PROGRAM bits, while the conversion from MIDI2 to MIDI1 needs the extraction from that instead. This patch implements the missing bank selection mechanism in those conversions. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240527151852.29036-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/xe: Only use reserved BCS instances for usm migrate exec queueMatthew Brost1-7/+5
[ Upstream commit c8ea2c31f5ea437199b239d76ad5db27343edb0c ] The GuC context scheduling queue is 2 entires deep, thus it is possible for a migration job to be stuck behind a fault if migration exec queue shares engines with user jobs. This can deadlock as the migrate exec queue is required to service page faults. Avoid deadlock by only using reserved BCS instances for usm migrate exec queue. Fixes: a043fbab7af5 ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240415190453.696553-2-matthew.brost@intel.com Reviewed-by: Brian Welty <brian.welty@intel.com> (cherry picked from commit 04f4a70a183a688a60fe3882d6e4236ea02cfc67) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/xe: Change pcode timeout to 50msec while polling againHimal Prasad Ghimiray1-1/+1
[ Upstream commit 77b79df0268bee3ef38fd5e76e86a076ce02995d ] Polling is initially attempted with timeout_base_ms enabled for preemption, and if it exceeds this timeframe, another attempt is made without preemption, allowing an additional 50 ms before timing out. v2 - Rebase v3 - Move warnings to separate patch (Lucas) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Fixes: 7dc9b92dcfef ("drm/xe: Remove i915_utils dependency from xe_pcode.") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240508152216.3263109-2-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit c81858eb52266b3d6ba28ca4f62a198231a10cdc) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/xe: check pcode init status only on root gt of root tileRiana Tauro4-64/+94
[ Upstream commit 933fd5ffaf87a60a019992d48e3a96b5c3403d9f ] The root tile indicates the pcode initialization is complete when all tiles have completed their initialization. So the mailbox can be polled only on the root tile. Check pcode init status only on root tile and move it to device probe early as root tile is initialized there. Also make similar changes in resume paths. v2: add lock/unlocked version of pcode_mailbox_rw to allow pcode init to be called in device early probe (Rodrigo) v3: add code description about using root tile change function names to xe_pcode_probe_early and xe_pcode_init (Rodrigo) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240410085005.1126343-2-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Stable-dep-of: 77b79df0268b ("drm/xe: Change pcode timeout to 50msec while polling again") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/xe: Add dbg messages on the suspend resume functions.Rodrigo Vivi1-5/+17
[ Upstream commit f7f24b7950af4b1548ad5075ddb13eeb333bb782 ] In case of the suspend/resume flow getting locked up we can get reports with some useful hints on where it might get locked and if that has failed. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Stable-dep-of: 77b79df0268b ("drm/xe: Change pcode timeout to 50msec while polling again") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12selftests: mptcp: join: mark 'fail' tests as flakyMatthieu Baerts (NGI0)1-0/+2
[ Upstream commit 38af56e6668b455f7dd0a8e2d9afe74100068e17 ] These tests are rarely unstable. It depends on the CI running the tests, especially if it is also busy doing other tasks in parallel, and if a debug kernel config is being used. It looks like this issue is sometimes present with the NetDev CI. While this is being investigated, the tests are marked as flaky not to create noises on such CIs. Fixes: b6e074e171bc ("selftests: mptcp: add infinite map testcase") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/491 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-4-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12selftests: mptcp: add ms units for tc-netem delayGeliang Tang2-5/+5
[ Upstream commit 9109853a388b7b2b934f56f4ddb250d72e486555 ] 'delay 1' in tc-netem is confusing, not sure if it's a delay of 1 second or 1 millisecond. This patch explicitly adds millisecond units to make these commands clearer. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 38af56e6668b ("selftests: mptcp: join: mark 'fail' tests as flaky") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12selftests: mptcp: join: mark 'fastclose' tests as flakyMatthieu Baerts (NGI0)1-1/+7
[ Upstream commit 8c06ac2178a9dee887929232226e35a5cdda1793 ] These tests are flaky since their introduction. This might be less or not visible depending on the CI running the tests, especially if it is also busy doing other tasks in parallel, and if a debug kernel config is being used. It looks like this issue is often present with the NetDev CI. While this is being investigated, the tests are marked as flaky not to create noises on such CIs. Fixes: 01542c9bf9ab ("selftests: mptcp: add fastclose testcase") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/324 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-3-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12selftests: mptcp: simult flows: mark 'unbalanced' tests as flakyMatthieu Baerts (NGI0)1-3/+3
[ Upstream commit cc73a6577ae64247898269d138dee6b73ff710cc ] These tests are flaky since their introduction. This might be less or not visible depending on the CI running the tests, especially if it is also busy doing other tasks in parallel. A first analysis shown that the transfer can be slowed down when there are some re-injections at the MPTCP level. Such re-injections can of course happen, and disturb the transfer, but it looks strange to have them in this lab. That could be caused by the kernel having access to less CPU cycles -- e.g. when other activities are executed in parallel -- or by a misinterpretation on the MPTCP packet scheduler side. While this is being investigated, the tests are marked as flaky not to create noises in other CIs. Fixes: 219d04992b68 ("mptcp: push pending frames when subflow has free space") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/475 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-2-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ice: fix accounting if a VLAN already existsJacob Keller1-5/+6
[ Upstream commit 82617b9a04649e83ee8731918aeadbb6e6d7cbc7 ] The ice_vsi_add_vlan() function is used to add a VLAN filter for the target VSI. This function prepares a filter in the switch table for the given VSI. If it succeeds, the vsi->num_vlan counter is incremented. It is not considered an error to add a VLAN which already exists in the switch table, so the function explicitly checks and ignores -EEXIST. The vsi->num_vlan counter is still incremented. This seems incorrect, as it means we can double-count in the case where the same VLAN is added twice by the caller. The actual table will have one less filter than the count. The ice_vsi_del_vlan() function similarly checks and handles the -ENOENT condition for when deleting a filter that doesn't exist. This flow only decrements the vsi->num_vlan if it actually deleted a filter. The vsi->num_vlan counter is used only in a few places, primarily related to tracking the number of non-zero VLANs. If the vsi->num_vlans gets out of sync, then ice_vsi_num_non_zero_vlans() will incorrectly report more VLANs than are present, and ice_vsi_has_non_zero_vlans() could return true potentially in cases where there are only VLAN 0 filters left. Fix this by only incrementing the vsi->num_vlan in the case where we actually added an entry, and not in the case where the entry already existed. Fixes: a1ffafb0b4a4 ("ice: Support configuring the device to Double VLAN Mode") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240523-net-2024-05-23-intel-net-fixes-v1-2-17a923e0bb5f@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12idpf: don't enable NAPI and interrupts prior to allocating Rx buffersAlexander Lobakin3-5/+9
[ Upstream commit d514c8b54209de7a95ab37259fe32c7406976bd9 ] Currently, idpf enables NAPI and interrupts prior to allocating Rx buffers. This may lead to frame loss (there are no buffers to place incoming frames) and even crashes on quick ifup-ifdown. Interrupts must be enabled only after all the resources are here and available. Split interrupt init into two phases: initialization and enabling, and perform the second only after the queues are fully initialized. Note that we can't just move interrupt initialization down the init process, as the queues must have correct a ::q_vector pointer set and NAPI already added in order to allocate buffers correctly. Also, during the deinit process, disable HW interrupts first and only then disable NAPI. Otherwise, there can be a HW event leading to napi_schedule(), but the NAPI will already be unavailable. Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport") Reported-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240523-net-2024-05-23-intel-net-fixes-v1-1-17a923e0bb5f@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net: micrel: Fix lan8841_config_intr after getting out of sleep modeHoratiu Vultur1-1/+9
[ Upstream commit 4fb679040d9f758eeb3b4d01bbde6405bf20e64e ] When the interrupt is enabled, the function lan8841_config_intr tries to clear any pending interrupts by reading the interrupt status, then checks the return value for errors and then continue to enable the interrupt. It has been seen that once the system gets out of sleep mode, the interrupt status has the value 0x400 meaning that the PHY detected that the link was in low power. That is correct value but the problem is that the check is wrong. We try to check for errors but we return an error also in this case which is not an error. Therefore fix this by returning only when there is an error. Fixes: a8f1a19d27ef ("net: micrel: Add support for lan8841 PHY") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20240524085350.359812-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12net:fec: Add fec_enet_deinit()Xiaolei Wang1-0/+10
[ Upstream commit bf0497f53c8535f99b72041529d3f7708a6e2c0d ] When fec_probe() fails or fec_drv_remove() needs to release the fec queue and remove a NAPI context, therefore add a function corresponding to fec_enet_init() and call fec_enet_deinit() which does the opposite to release memory and remove a NAPI context. Fixes: 59d0f7465644 ("net: fec: init multi queue date structure") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240524050528.4115581-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>