summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-06-16nfs: fix undefined behavior in nfs_block_bits()Sergey Shtylyov1-2/+2
commit 3c0a2e0b0ae661457c8505fecc7be5501aa7a715 upstream. Shifting *signed int* typed constant 1 left by 31 bits causes undefined behavior. Specify the correct *unsigned long* type by using 1UL instead. Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Cc: stable@vger.kernel.org Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16EDAC/igen6: Convert PCIBIOS_* return codes to errnosIlpo Järvinen1-2/+2
commit f8367a74aebf88dc8b58a0db6a6c90b4cb8fc9d3 upstream. errcmd_enable_error_reporting() uses pci_{read,write}_config_word() that return PCIBIOS_* codes. The return code is then returned all the way into the probe function igen6_probe() that returns it as is. The probe functions, however, should return normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning it from errcmd_enable_error_reporting(). Fixes: 10590a9d4f23 ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240527132236.13875-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16EDAC/amd64: Convert PCIBIOS_* return codes to errnosIlpo Järvinen1-3/+5
commit 3ec8ebd8a5b782d56347ae884de880af26f93996 upstream. gpu_get_node_map() uses pci_read_config_dword() that returns PCIBIOS_* codes. The return code is then returned all the way into the module init function amd64_edac_init() that returns it as is. The module init functions, however, should return normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning it from gpu_get_node_map(). For consistency, convert also the other similar cases which return PCIBIOS_* codes even if they do not have any bugs at the moment. Fixes: 4251566ebc1c ("EDAC/amd64: Cache and use GPU node map") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240527132236.13875-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ALSA: ump: Don't accept an invalid UMP protocol numberTakashi Iwai1-0/+7
commit ac0d71ee534e67c7e53439e8e9cb45ed40731660 upstream. When a UMP Stream Configuration message is received, the driver tries to switch the protocol, but there was no sanity check of the protocol, hence it can pass an invalid value. Add the check and bail out if a wrong value is passed. Fixes: a79807683781 ("ALSA: ump: Add helper to change MIDI protocol") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240529164723.18309-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ALSA: ump: Don't clear bank selection after sending a program changeTakashi Iwai1-1/+0
commit fe85f6e607d75b856e7229924c71f55e005f8284 upstream. The current code clears the bank selection MSB/LSB after sending a program change, but this can be wrong, as many apps may not send the full bank selection with both MSB and LSB but sending only one. Better to keep the previous bank set. Fixes: 0b5288f5fe63 ("ALSA: ump: Add legacy raw MIDI support") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240529083823.5778-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ASoC: SOF: ipc4-topology: Fix input format query of process modules without ↵Peter Ujfalusi1-0/+8
base extension commit ffa077b2f6ad124ec3d23fbddc5e4b0ff2647af8 upstream. If a process module does not have base config extension then the same format applies to all of it's inputs and the process->base_config_ext is NULL, causing NULL dereference when specifically crafted topology and sequences used. Fixes: 648fea128476 ("ASoC: SOF: ipc4-topology: set copier output format for process module") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Cc: stable@vger.kernel.org Link: https://msgid.link/r/20240529121201.14687-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()dicken.ding1-1/+4
commit b84a8aba806261d2f759ccedf4a2a6a80a5e55ba upstream. irq_find_at_or_after() dereferences the interrupt descriptor which is returned by mt_find() while neither holding sparse_irq_lock nor RCU read lock, which means the descriptor can be freed between mt_find() and the dereference: CPU0 CPU1 desc = mt_find() delayed_free_desc(desc) irq_desc_get_irq(desc) The use-after-free is reported by KASAN: Call trace: irq_get_next_irq+0x58/0x84 show_stat+0x638/0x824 seq_read_iter+0x158/0x4ec proc_reg_read_iter+0x94/0x12c vfs_read+0x1e0/0x2c8 Freed by task 4471: slab_free_freelist_hook+0x174/0x1e0 __kmem_cache_free+0xa4/0x1dc kfree+0x64/0x128 irq_kobj_release+0x28/0x3c kobject_put+0xcc/0x1e0 delayed_free_desc+0x14/0x2c rcu_do_batch+0x214/0x720 Guard the access with a RCU read lock section. Fixes: 721255b9826b ("genirq: Use a maple tree for interrupt descriptor management") Signed-off-by: dicken.ding <dicken.ding@mediatek.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240524091739.31611-1-dicken.ding@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16i3c: master: svc: fix invalidate IBI type and miss call client IBI handlerFrank Li1-3/+13
commit 38baed9b8600008e5d7bc8cb9ceccc1af3dd54b7 upstream. In an In-Band Interrupt (IBI) handle, the code logic is as follows: 1: writel(SVC_I3C_MCTRL_REQUEST_AUTO_IBI | SVC_I3C_MCTRL_IBIRESP_AUTO, master->regs + SVC_I3C_MCTRL); 2: ret = readl_relaxed_poll_timeout(master->regs + SVC_I3C_MSTATUS, val, SVC_I3C_MSTATUS_IBIWON(val), 0, 1000); ... 3: ibitype = SVC_I3C_MSTATUS_IBITYPE(status); ibiaddr = SVC_I3C_MSTATUS_IBIADDR(status); SVC_I3C_MSTATUS_IBIWON may be set before step 1. Thus, step 2 will return immediately, and the I3C controller has not sent out the 9th SCL yet. Consequently, ibitype and ibiaddr are 0, resulting in an unknown IBI type occurrence and missing call I3C client driver's IBI handler. A typical case is that SVC_I3C_MSTATUS_IBIWON is set when an IBI occurs during the controller send start frame in svc_i3c_master_xfer(). Clear SVC_I3C_MSTATUS_IBIWON before issue SVC_I3C_MCTRL_REQUEST_AUTO_IBI to fix this issue. Cc: stable@vger.kernel.org Fixes: 5e5e3c92e748 ("i3c: master: svc: fix wrong data return when IBI happen during start frame") Signed-off-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20240506164009.21375-3-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16s390/cpacf: Make use of invalid opcode produce a link errorHarald Freudenberger1-2/+10
commit 32e8bd6423fc127d2b37bdcf804fd76af3bbec79 upstream. Instead of calling BUG() at runtime introduce and use a prototype for a non-existing function to produce a link error during compile when a not supported opcode is used with the __cpacf_query() or __cpacf_check_opcode() inline functions. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16s390/cpacf: Split and rework cpacf query functionsHarald Freudenberger1-20/+81
commit 830999bd7e72f4128b9dfa37090d9fa8120ce323 upstream. Rework the cpacf query functions to use the correct RRE or RRF instruction formats and set register fields within instructions correctly. Fixes: 1afd43e0fbba ("s390/crypto: allow to query all known cpacf functions") Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Suggested-by: Juergen Christ <jchrist@linux.ibm.com> Suggested-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16s390/ap: Fix crash in AP internal function modify_bitmap()Harald Freudenberger1-1/+1
commit d4f9d5a99a3fd1b1c691b7a1a6f8f3f25f4116c9 upstream. A system crash like this Failing address: 200000cb7df6f000 TEID: 200000cb7df6f403 Fault in home space mode while using kernel ASCE. AS:00000002d71bc007 R3:00000003fe5b8007 S:000000011a446000 P:000000015660c13d Oops: 0038 ilc:3 [#1] PREEMPT SMP Modules linked in: mlx5_ib ... CPU: 8 PID: 7556 Comm: bash Not tainted 6.9.0-rc7 #8 Hardware name: IBM 3931 A01 704 (LPAR) Krnl PSW : 0704e00180000000 0000014b75e7b606 (ap_parse_bitmap_str+0x10e/0x1f8) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000001 ffffffffffffffc0 0000000000000001 00000048f96b75d3 000000cb00000100 ffffffffffffffff ffffffffffffffff 000000cb7df6fce0 000000cb7df6fce0 00000000ffffffff 000000000000002b 00000048ffffffff 000003ff9b2dbc80 200000cb7df6fcd8 0000014bffffffc0 000000cb7df6fbc8 Krnl Code: 0000014b75e7b5fc: a7840047 brc 8,0000014b75e7b68a 0000014b75e7b600: 18b2 lr %r11,%r2 #0000014b75e7b602: a7f4000a brc 15,0000014b75e7b616 >0000014b75e7b606: eb22d00000e6 laog %r2,%r2,0(%r13) 0000014b75e7b60c: a7680001 lhi %r6,1 0000014b75e7b610: 187b lr %r7,%r11 0000014b75e7b612: 84960021 brxh %r9,%r6,0000014b75e7b654 0000014b75e7b616: 18e9 lr %r14,%r9 Call Trace: [<0000014b75e7b606>] ap_parse_bitmap_str+0x10e/0x1f8 ([<0000014b75e7b5dc>] ap_parse_bitmap_str+0xe4/0x1f8) [<0000014b75e7b758>] apmask_store+0x68/0x140 [<0000014b75679196>] kernfs_fop_write_iter+0x14e/0x1e8 [<0000014b75598524>] vfs_write+0x1b4/0x448 [<0000014b7559894c>] ksys_write+0x74/0x100 [<0000014b7618a440>] __do_syscall+0x268/0x328 [<0000014b761a3558>] system_call+0x70/0x98 INFO: lockdep is turned off. Last Breaking-Event-Address: [<0000014b75e7b636>] ap_parse_bitmap_str+0x13e/0x1f8 Kernel panic - not syncing: Fatal exception: panic_on_oops occured when /sys/bus/ap/a[pq]mask was updated with a relative mask value (like +0x10-0x12,+60,-90) with one of the numeric values exceeding INT_MAX. The fix is simple: use unsigned long values for the internal variables. The correct checks are already in place in the function but a simple int for the internal variables was used with the possibility to overflow. Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16parisc: Define sigset_t in parisc uapi headerHelge Deller2-12/+10
commit 487fa28fa8b60417642ac58e8beda6e2509d18f9 upstream. The util-linux debian package fails to build on parisc, because sigset_t isn't defined in asm/signal.h when included from userspace. Move the sigset_t type from internal header to the uapi header to fix the build. Link: https://buildd.debian.org/status/fetch.php?pkg=util-linux&arch=hppa&ver=2.40-7&stamp=1714163443&raw=0 Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16parisc: Define HAVE_ARCH_HUGETLB_UNMAPPED_AREAHelge Deller1-0/+1
commit d4a599910193b85f76c100e30d8551c8794f8c2a upstream. Define the HAVE_ARCH_HUGETLB_UNMAPPED_AREA macro like other platforms do in their page.h files to avoid this compile warning: arch/parisc/mm/hugetlbpage.c:25:1: warning: no previous prototype for 'hugetlb_get_unmapped_area' [-Wmissing-prototypes] Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 6.0+ Reported-by: John David Anglin <dave.anglin@bell.net> Tested-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ARM: dts: samsung: exynos4412-origen: fix keypad no-autorepeatKrzysztof Kozlowski1-1/+1
commit 88208d3cd79821117fd3fb80d9bcab618467d37b upstream. Although the Samsung SoC keypad binding defined linux,keypad-no-autorepeat property, Linux driver never implemented it and always used linux,input-no-autorepeat. Correct the DTS to use property actually implemented. This also fixes dtbs_check errors like: exynos4412-origen.dtb: keypad@100a0000: 'linux,keypad-no-autorepeat' does not match any of the regexes: '^key-[0-9a-z]+$', 'pinctrl-[0-9]+' Cc: <stable@vger.kernel.org> Fixes: bd08f6277e44 ("ARM: dts: Add keypad entries to Exynos4412 based Origen") Link: https://lore.kernel.org/r/20240312183105.715735-2-krzysztof.kozlowski@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ARM: dts: samsung: smdk4412: fix keypad no-autorepeatKrzysztof Kozlowski1-1/+1
commit 4ac4c1d794e7ff454d191bbdab7585ed8dbf3758 upstream. Although the Samsung SoC keypad binding defined linux,keypad-no-autorepeat property, Linux driver never implemented it and always used linux,input-no-autorepeat. Correct the DTS to use property actually implemented. This also fixes dtbs_check errors like: exynos4412-smdk4412.dtb: keypad@100a0000: 'key-A', 'key-B', 'key-C', 'key-D', 'key-E', 'linux,keypad-no-autorepeat' do not match any of the regexes: '^key-[0-9a-z]+$', 'pinctrl-[0-9]+' Cc: <stable@vger.kernel.org> Fixes: c9b92dd70107 ("ARM: dts: Add keypad entries to SMDK4412") Link: https://lore.kernel.org/r/20240312183105.715735-3-krzysztof.kozlowski@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ARM: dts: samsung: smdkv310: fix keypad no-autorepeatKrzysztof Kozlowski1-1/+1
commit 87d8e522d6f5a004f0aa06c0def302df65aff296 upstream. Although the Samsung SoC keypad binding defined linux,keypad-no-autorepeat property, Linux driver never implemented it and always used linux,input-no-autorepeat. Correct the DTS to use property actually implemented. This also fixes dtbs_check errors like: exynos4210-smdkv310.dtb: keypad@100a0000: 'linux,keypad-no-autorepeat' does not match any of the regexes: '^key-[0-9a-z]+$', 'pinctrl-[0-9]+' Cc: <stable@vger.kernel.org> Fixes: 0561ceabd0f1 ("ARM: dts: Add intial dts file for EXYNOS4210 SoC, SMDKV310 and ORIGEN") Link: https://lore.kernel.org/r/20240312183105.715735-1-krzysztof.kozlowski@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16riscv: dts: starfive: Remove PMIC interrupt info for Visionfive 2 boardShengyu Qu1-1/+0
commit 0f74c64f0a9f6e1e7cf17bea3d4350fa6581e0d7 upstream. Interrupt line number of the AXP15060 PMIC is not a necessary part of its device tree. Originally the binding required one, so the dts patch added an invalid interrupt that the driver ignored (0) as the interrupt line of the PMIC is not actually connected on this platform. This went unnoticed during review as it would have been a valid interrupt for a GPIO controller, but it is not for the PLIC. The PLIC, on this platform at least, silently ignores the enablement of interrupt 0. Bo Gan is running a modified version of OpenSBI that faults if writes are done to reserved fields, so their kernel runs into problems. Delete the invalid interrupt from the device tree. Cc: stable@vger.kernel.org Reported-by: Bo Gan <ganboing@gmail.com> Link: https://lore.kernel.org/all/c8b6e960-2459-130f-e4e4-7c9c2ebaa6d3@gmail.com/ Signed-off-by: Shengyu Qu <wiagn233@outlook.com> Fixes: 2378341504de ("riscv: dts: starfive: Enable axp15060 pmic for cpufreq") [conor: rewrite the commit message to add more detail] Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ext4: fix mb_cache_entry's e_refcnt leak in ext4_xattr_block_cache_find()Baokun Li1-1/+3
commit 0c0b4a49d3e7f49690a6827a41faeffad5df7e21 upstream. Syzbot reports a warning as follows: ============================================ WARNING: CPU: 0 PID: 5075 at fs/mbcache.c:419 mb_cache_destroy+0x224/0x290 Modules linked in: CPU: 0 PID: 5075 Comm: syz-executor199 Not tainted 6.9.0-rc6-gb947cc5bf6d7 RIP: 0010:mb_cache_destroy+0x224/0x290 fs/mbcache.c:419 Call Trace: <TASK> ext4_put_super+0x6d4/0xcd0 fs/ext4/super.c:1375 generic_shutdown_super+0x136/0x2d0 fs/super.c:641 kill_block_super+0x44/0x90 fs/super.c:1675 ext4_kill_sb+0x68/0xa0 fs/ext4/super.c:7327 [...] ============================================ This is because when finding an entry in ext4_xattr_block_cache_find(), if ext4_sb_bread() returns -ENOMEM, the ce's e_refcnt, which has already grown in the __entry_find(), won't be put away, and eventually trigger the above issue in mb_cache_destroy() due to reference count leakage. So call mb_cache_entry_put() on the -ENOMEM error branch as a quick fix. Reported-by: syzbot+dd43bd0f7474512edc47@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=dd43bd0f7474512edc47 Fixes: fb265c9cb49e ("ext4: add ext4_sb_bread() to disambiguate ENOMEM cases") Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20240504075526.2254349-2-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ext4: set type of ac_groups_linear_remaining to __u32 to avoid overflowBaokun Li1-1/+1
commit 9a9f3a9842927e4af7ca10c19c94dad83bebd713 upstream. Now ac_groups_linear_remaining is of type __u16 and s_mb_max_linear_groups is of type unsigned int, so an overflow occurs when setting a value above 65535 through the mb_max_linear_groups sysfs interface. Therefore, the type of ac_groups_linear_remaining is set to __u32 to avoid overflow. Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning") CC: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20240319113325.3110393-8-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16ext4: Fixes len calculation in mpage_journal_page_buffersRitesh Harjani (IBM)1-1/+1
commit c2a09f3d782de952f09a3962d03b939e7fa7ffa4 upstream. Truncate operation can race with writeback, in which inode->i_size can get truncated and therefore size - folio_pos() can be negative. This fixes the len calculation. However this path doesn't get easily triggered even with data journaling. Cc: stable@kernel.org # v6.5 Fixes: 80be8c5cc925 ("Fixes: ext4: Make mpage_journal_page_buffers use folio") Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/cff4953b5c9306aba71e944ab176a5d396b9a1b7.1709182250.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vmsLang Yu1-1/+2
commit 2a705f3e49d20b59cd9e5cc3061b2d92ebe1e5f0 upstream. Observed on gfx8 ASIC where KFD_IOC_ALLOC_MEM_FLAGS_AQL_QUEUE_MEM is used. Two attachments use the same VM, root PD would be locked twice. [ 57.910418] Call Trace: [ 57.793726] ? reserve_bo_and_cond_vms+0x111/0x1c0 [amdgpu] [ 57.793820] amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu+0x6c/0x1c0 [amdgpu] [ 57.793923] ? idr_get_next_ul+0xbe/0x100 [ 57.793933] kfd_process_device_free_bos+0x7e/0xf0 [amdgpu] [ 57.794041] kfd_process_wq_release+0x2ae/0x3c0 [amdgpu] [ 57.794141] ? process_scheduled_works+0x29c/0x580 [ 57.794147] process_scheduled_works+0x303/0x580 [ 57.794157] ? __pfx_worker_thread+0x10/0x10 [ 57.794160] worker_thread+0x1a2/0x370 [ 57.794165] ? __pfx_worker_thread+0x10/0x10 [ 57.794167] kthread+0x11b/0x150 [ 57.794172] ? __pfx_kthread+0x10/0x10 [ 57.794177] ret_from_fork+0x3d/0x60 [ 57.794181] ? __pfx_kthread+0x10/0x10 [ 57.794184] ret_from_fork_asm+0x1b/0x30 Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.6.x only Signed-off-by: Tomáš Trnka <trnka@scm.com> [TT: trivially adjusted for 6.6 which does not have commit 05d249352f (third argument to drm_exec_init removed)] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16sparc: move struct termio to asm/termios.hMike Gilbert2-10/+9
commit c32d18e7942d7589b62e301eb426b32623366565 upstream. Every other arch declares struct termio in asm/termios.h, so make sparc match them. Resolves a build failure in the PPP software package, which includes both bits/ioctl-types.h via sys/ioctl.h (glibc) and asm/termbits.h. Closes: https://bugs.gentoo.org/918992 Signed-off-by: Mike Gilbert <floppym@gentoo.org> Cc: stable@vger.kernel.org Reviewed-by: Andreas Larsson <andreas@gaisler.com> Tested-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240306171149.3843481-1-floppym@gentoo.org Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16net: fix __dst_negative_advice() raceEric Dumazet5-47/+30
commit 92f1655aa2b2294d0b49925f3b875a634bd3b59e upstream. __dst_negative_advice() does not enforce proper RCU rules when sk->dst_cache must be cleared, leading to possible UAF. RCU rules are that we must first clear sk->sk_dst_cache, then call dst_release(old_dst). Note that sk_dst_reset(sk) is implementing this protocol correctly, while __dst_negative_advice() uses the wrong order. Given that ip6_negative_advice() has special logic against RTF_CACHE, this means each of the three ->negative_advice() existing methods must perform the sk_dst_reset() themselves. Note the check against NULL dst is centralized in __dst_negative_advice(), there is no need to duplicate it in various callbacks. Many thanks to Clement Lecigne for tracking this issue. This old bug became visible after the blamed commit, using UDP sockets. Fixes: a87cb3e48ee8 ("net: Facility to report route quality of connected sockets") Reported-by: Clement Lecigne <clecigne@google.com> Diagnosed-by: Clement Lecigne <clecigne@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [Lee: Stable backport] Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16kdb: Use format-specifiers rather than memset() for padding in kdb_read()Daniel Thompson1-5/+3
commit c9b51ddb66b1d96e4d364c088da0f1dfb004c574 upstream. Currently when the current line should be removed from the display kdb_read() uses memset() to fill a temporary buffer with spaces. The problem is not that this could be trivially implemented using a format string rather than open coding it. The real problem is that it is possible, on systems with a long kdb_prompt_str, to write past the end of the tmpbuffer. Happily, as mentioned above, this can be trivially implemented using a format string. Make it so! Cc: stable@vger.kernel.org Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20240424-kgdb_read_refactor-v3-5-f236dbe9828d@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16kdb: Merge identical case statements in kdb_read()Daniel Thompson1-9/+1
commit 6244917f377bf64719551b58592a02a0336a7439 upstream. The code that handles case 14 (down) and case 16 (up) has been copy and pasted despite being byte-for-byte identical. Combine them. Cc: stable@vger.kernel.org # Not a bug fix but it is needed for later bug fixes Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20240424-kgdb_read_refactor-v3-4-f236dbe9828d@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16kdb: Fix console handling when editing and tab-completing commandsDaniel Thompson1-0/+5
commit db2f9c7dc29114f531df4a425d0867d01e1f1e28 upstream. Currently, if the cursor position is not at the end of the command buffer and the user uses the Tab-complete functions, then the console does not leave the cursor in the correct position. For example consider the following buffer with the cursor positioned at the ^: md kdb_pro 10 ^ Pressing tab should result in: md kdb_prompt_str 10 ^ However this does not happen. Instead the cursor is placed at the end (after then 10) and further cursor movement redraws incorrectly. The same problem exists when we double-Tab but in a different part of the code. Fix this by sending a carriage return and then redisplaying the text to the left of the cursor. Cc: stable@vger.kernel.org Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20240424-kgdb_read_refactor-v3-3-f236dbe9828d@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16kdb: Use format-strings rather than '\0' injection in kdb_read()Daniel Thompson1-21/+34
commit 09b35989421dfd5573f0b4683c7700a7483c71f9 upstream. Currently when kdb_read() needs to reposition the cursor it uses copy and paste code that works by injecting an '\0' at the cursor position before delivering a carriage-return and reprinting the line (which stops at the '\0'). Tidy up the code by hoisting the copy and paste code into an appropriately named function. Additionally let's replace the '\0' injection with a proper field width parameter so that the string will be abridged during formatting instead. Cc: stable@vger.kernel.org # Not a bug fix but it is needed for later bug fixes Tested-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240424-kgdb_read_refactor-v3-2-f236dbe9828d@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16kdb: Fix buffer overflow during tab-completeDaniel Thompson1-8/+13
commit e9730744bf3af04cda23799029342aa3cddbc454 upstream. Currently, when the user attempts symbol completion with the Tab key, kdb will use strncpy() to insert the completed symbol into the command buffer. Unfortunately it passes the size of the source buffer rather than the destination to strncpy() with predictably horrible results. Most obviously if the command buffer is already full but cp, the cursor position, is in the middle of the buffer, then we will write past the end of the supplied buffer. Fix this by replacing the dubious strncpy() calls with memmove()/memcpy() calls plus explicit boundary checks to make sure we have enough space before we start moving characters around. Reported-by: Justin Stitt <justinstitt@google.com> Closes: https://lore.kernel.org/all/CAFhGd8qESuuifuHsNjFPR-Va3P80bxrw+LqvC8deA8GziUJLpw@mail.gmail.com/ Cc: stable@vger.kernel.org Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Tested-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20240424-kgdb_read_refactor-v3-1-f236dbe9828d@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16wifi: ath10k: fix QCOM_RPROC_COMMON dependencyDmitry Baryshkov1-0/+1
commit 21ae74e1bf18331ae5e279bd96304b3630828009 upstream. If ath10k_snoc is built-in, while Qualcomm remoteprocs are built as modules, compilation fails with: /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_init': drivers/net/wireless/ath/ath10k/snoc.c:1534: undefined reference to `qcom_register_ssr_notifier' /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_deinit': drivers/net/wireless/ath/ath10k/snoc.c:1551: undefined reference to `qcom_unregister_ssr_notifier' Add corresponding dependency to ATH10K_SNOC Kconfig entry so that it's built as module if QCOM_RPROC_COMMON is built as module too. Fixes: 747ff7d3d742 ("ath10k: Don't always treat modem stop events as crashes") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240511-ath10k-snoc-dep-v1-1-9666e3af5c27@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16bonding: fix oops during rmmodTony Battersby1-6/+7
commit a45835a0bb6ef7d5ddbc0714dd760de979cb6ece upstream. "rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function"). Here are the relevant functions being called: bonding_exit() bond_destroy_debugfs() debugfs_remove_recursive(bonding_debug_root); bonding_debug_root = NULL; <--------- SET TO NULL HERE bond_netlink_fini() rtnl_link_unregister() __rtnl_link_unregister() unregister_netdevice_many_notify() bond_uninit() bond_debug_unregister() (commit removed check for bonding_debug_root == NULL) debugfs_remove() simple_recursive_removal() down_write() -> OOPS However, reverting the bad commit does not solve the problem completely because the original code contains a race that could cause the same oops, although it was much less likely to be triggered unintentionally: CPU1 rmmod bonding bonding_exit() bond_destroy_debugfs() debugfs_remove_recursive(bonding_debug_root); CPU2 echo -bond0 > /sys/class/net/bonding_masters bond_uninit() bond_debug_unregister() if (!bonding_debug_root) CPU1 bonding_debug_root = NULL; So do NOT revert the bad commit (since the removed checks were racy anyway), and instead change the order of actions taken during module removal. The same oops can also happen if there is an error during module init, so apply the same fix there. Fixes: cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function") Cc: stable@vger.kernel.org Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/641f914f-3216-4eeb-87dd-91b78aa97773@cybernetics.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16watchdog: rti_wdt: Set min_hw_heartbeat_ms to accommodate a safety marginJudith Mendez1-19/+15
commit cae58516534e110f4a8558d48aa4435e15519121 upstream. On AM62x, the watchdog is pet before the valid window is open. Fix min_hw_heartbeat and accommodate a 2% + static offset safety margin. The static offset accounts for max hardware error. Remove the hack in the driver which shifts the open window boundary, since it is no longer necessary due to the fix mentioned above. cc: stable@vger.kernel.org Fixes: 5527483f8f7c ("watchdog: rti-wdt: attach to running watchdog during probe") Signed-off-by: Judith Mendez <jm@ti.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240417205700.3947408-1-jm@ti.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16selftests/mm: fix build warnings on ppc64Michael Ellerman2-0/+2
commit 1901472fa880e5706f90926cd85a268d2d16bf84 upstream. Fix warnings like: In file included from uffd-unit-tests.c:8: uffd-unit-tests.c: In function `uffd_poison_handle_fault': uffd-common.h:45:33: warning: format `%llu' expects argument of type `long long unsigned int', but argument 3 has type `__u64' {aka `long unsigned int'} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Link: https://lkml.kernel.org/r/20240521030219.57439-1-mpe@ellerman.id.au Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepagesDev Jain1-0/+2
commit 9ad665ef55eaad1ead1406a58a34f615a7c18b5e upstream. Currently, the test tries to set nr_hugepages to zero, but that is not actually done because the file offset is not reset after read(). Fix that using lseek(). Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain@arm.com> Cc: <stable@vger.kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sri Jayaramappa <sjayaram@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAILHailong.Liu1-3/+2
commit 8e0545c83d672750632f46e3f9ad95c48c91a0fc upstream. commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc") includes support for __GFP_NOFAIL, but it presents a conflict with commit dd544141b9eb ("vmalloc: back off when the current task is OOM-killed"). A possible scenario is as follows: process-a __vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL) __vmalloc_area_node() vm_area_alloc_pages() --> oom-killer send SIGKILL to process-a if (fatal_signal_pending(current)) break; --> return NULL; To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages() if __GFP_NOFAIL set. This issue occurred during OPLUS KASAN TEST. Below is part of the log -> oom-killer sends signal to process [65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198 [65731.259685] [T32454] Call trace: [65731.259698] [T32454] dump_backtrace+0xf4/0x118 [65731.259734] [T32454] show_stack+0x18/0x24 [65731.259756] [T32454] dump_stack_lvl+0x60/0x7c [65731.259781] [T32454] dump_stack+0x18/0x38 [65731.259800] [T32454] mrdump_common_die+0x250/0x39c [mrdump] [65731.259936] [T32454] ipanic_die+0x20/0x34 [mrdump] [65731.260019] [T32454] atomic_notifier_call_chain+0xb4/0xfc [65731.260047] [T32454] notify_die+0x114/0x198 [65731.260073] [T32454] die+0xf4/0x5b4 [65731.260098] [T32454] die_kernel_fault+0x80/0x98 [65731.260124] [T32454] __do_kernel_fault+0x160/0x2a8 [65731.260146] [T32454] do_bad_area+0x68/0x148 [65731.260174] [T32454] do_mem_abort+0x151c/0x1b34 [65731.260204] [T32454] el1_abort+0x3c/0x5c [65731.260227] [T32454] el1h_64_sync_handler+0x54/0x90 [65731.260248] [T32454] el1h_64_sync+0x68/0x6c [65731.260269] [T32454] z_erofs_decompress_queue+0x7f0/0x2258 --> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL); kernel panic by NULL pointer dereference. erofs assume kvmalloc with __GFP_NOFAIL never return NULL. [65731.260293] [T32454] z_erofs_runqueue+0xf30/0x104c [65731.260314] [T32454] z_erofs_readahead+0x4f0/0x968 [65731.260339] [T32454] read_pages+0x170/0xadc [65731.260364] [T32454] page_cache_ra_unbounded+0x874/0xf30 [65731.260388] [T32454] page_cache_ra_order+0x24c/0x714 [65731.260411] [T32454] filemap_fault+0xbf0/0x1a74 [65731.260437] [T32454] __do_fault+0xd0/0x33c [65731.260462] [T32454] handle_mm_fault+0xf74/0x3fe0 [65731.260486] [T32454] do_mem_abort+0x54c/0x1b34 [65731.260509] [T32454] el0_da+0x44/0x94 [65731.260531] [T32454] el0t_64_sync_handler+0x98/0xb4 [65731.260553] [T32454] el0t_64_sync+0x198/0x19c Link: https://lkml.kernel.org/r/20240510100131.1865-1-hailong.liu@oppo.com Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL") Signed-off-by: Hailong.Liu <hailong.liu@oppo.com> Acked-by: Michal Hocko <mhocko@suse.com> Suggested-by: Barry Song <21cnbao@gmail.com> Reported-by: Oven <liyangouwen1@oppo.com> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Chao Yu <chao@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Gao Xiang <xiang@kernel.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock againYuanyuan Zhong1-2/+7
commit 6d065f507d82307d6161ac75c025111fb8b08a46 upstream. After switching smaps_rollup to use VMA iterator, searching for next entry is part of the condition expression of the do-while loop. So the current VMA needs to be addressed before the continue statement. Otherwise, with some VMAs skipped, userspace observed memory consumption from /proc/pid/smaps_rollup will be smaller than the sum of the corresponding fields from /proc/pid/smaps. Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end") Signed-off-by: Yuanyuan Zhong <yzhong@purestorage.com> Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nidFrank van der Linden1-3/+3
commit 55d134a7b499c77e7cfd0ee41046f3c376e791e5 upstream. The hugetlb_cma code passes 0 in the order_per_bit argument to cma_declare_contiguous_nid (the alignment, computed using the page order, is correctly passed in). This causes a bit in the cma allocation bitmap to always represent a 4k page, making the bitmaps potentially very large, and slower. It would create bitmaps that would be pretty big. E.g. for a 4k page size on x86, hugetlb_cma=64G would mean a bitmap size of (64G / 4k) / 8 == 2M. With HUGETLB_PAGE_ORDER as order_per_bit, as intended, this would be (64G / 2M) / 8 == 4k. So, that's quite a difference. Also, this restricted the hugetlb_cma area to ((PAGE_SIZE << MAX_PAGE_ORDER) * 8) * PAGE_SIZE (e.g. 128G on x86) , since bitmap_alloc uses normal page allocation, and is thus restricted by MAX_PAGE_ORDER. Specifying anything about that would fail the CMA initialization. So, correctly pass in the order instead. Link: https://lkml.kernel.org/r/20240404162515.527802-2-fvdl@google.com Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma") Signed-off-by: Frank van der Linden <fvdl@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16mm/cma: drop incorrect alignment check in cma_init_reserved_memFrank van der Linden1-4/+0
commit b174f139bdc8aaaf72f5b67ad1bd512c4868a87e upstream. cma_init_reserved_mem uses IS_ALIGNED to check if the size represented by one bit in the cma allocation bitmask is aligned with CMA_MIN_ALIGNMENT_BYTES (pageblock size). However, this is too strict, as this will fail if order_per_bit > pageblock_order, which is a valid configuration. We could check IS_ALIGNED both ways, but since both numbers are powers of two, no check is needed at all. Link: https://lkml.kernel.org/r/20240404162515.527802-1-fvdl@google.com Fixes: de9e14eebf33 ("drivers: dma-contiguous: add initialization from device tree") Signed-off-by: Frank van der Linden <fvdl@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16sparc64: Fix number of online CPUsSam Ravnborg4-18/+3
commit 98937707fea8375e8acea0aaa0b68a956dd52719 upstream. Nick Bowler reported: When using newer kernels on my Ultra 60 with dual 450MHz UltraSPARC-II CPUs, I noticed that only CPU 0 comes up, while older kernels (including 4.7) are working fine with both CPUs. I bisected the failure to this commit: 9b2f753ec23710aa32c0d837d2499db92fe9115b is the first bad commit commit 9b2f753ec23710aa32c0d837d2499db92fe9115b Author: Atish Patra <atish.patra@oracle.com> Date: Thu Sep 15 14:54:40 2016 -0600 sparc64: Fix cpu_possible_mask if nr_cpus is set This is a small change that reverts very easily on top of 5.18: there is just one trivial conflict. Once reverted, both CPUs work again. Maybe this is related to the fact that the CPUs on this system are numbered CPU0 and CPU2 (there is no CPU1)? The current code that adjust cpu_possible based on nr_cpu_ids do not take into account that CPU's may not come one after each other. Move the chech to the function that setup the cpu_possible mask so there is no need to adjust it later. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Fixes: 9b2f753ec237 ("sparc64: Fix cpu_possible_mask if nr_cpus is set") Reported-by: Nick Bowler <nbowler@draconx.ca> Tested-by: Nick Bowler <nbowler@draconx.ca> Link: https://lore.kernel.org/sparclinux/20201009161924.c8f031c079dd852941307870@gmx.de/ Link: https://lore.kernel.org/all/CADyTPEwt=ZNams+1bpMB1F9w_vUdPsGCt92DBQxxq_VtaLoTdw@mail.gmail.com/ Cc: stable@vger.kernel.org # v4.8+ Cc: Andreas Larsson <andreas@gaisler.com> Cc: David S. Miller <davem@davemloft.net> Cc: Atish Patra <atish.patra@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Vijay Kumar <vijay.ac.kumar@oracle.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240330-sparc64-warnings-v1-9-37201023ee2f@ravnborg.org Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16rtla/timerlat: Fix histogram report when a cpu count is 0John Kacur1-18/+42
commit 01b05fc0e5f3aec443a9a8ffa0022cbca2fd3608 upstream. On short runs it is possible to get no samples on a cpu, like this: # rtla timerlat hist -u -T50 Index IRQ-001 Thr-001 Usr-001 IRQ-002 Thr-002 Usr-002 2 1 0 0 0 0 0 33 0 1 0 0 0 0 36 0 0 1 0 0 0 49 0 0 0 1 0 0 52 0 0 0 0 1 0 over: 0 0 0 0 0 0 count: 1 1 1 1 1 0 min: 2 33 36 49 52 18446744073709551615 avg: 2 33 36 49 52 - max: 2 33 36 49 52 0 rtla timerlat hit stop tracing IRQ handler delay: (exit from idle) 48.21 us (91.09 %) IRQ latency: 49.11 us Timerlat IRQ duration: 2.17 us (4.09 %) Blocking thread: 1.01 us (1.90 %) swapper/2:0 1.01 us ------------------------------------------------------------------------ Thread latency: 52.93 us (100%) Max timerlat IRQ latency from idle: 49.11 us in cpu 2 Note, the value 18446744073709551615 is the same as ~0. Fix this by reporting no results for the min, avg and max if the count is 0. Link: https://lkml.kernel.org/r/20240510190318.44295-1-jkacur@redhat.com Cc: stable@vger.kernel.org Fixes: 1eeb6328e8b3 ("rtla/timerlat: Add timerlat hist mode") Suggested-by: Daniel Bristot de Oliveria <bristot@kernel.org> Signed-off-by: John Kacur <jkacur@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16intel_th: pci: Add Meteor Lake-S CPU supportAlexander Shishkin1-0/+5
commit a4f813c3ec9d1c32bc402becd1f011b3904dd699 upstream. Add support for the Trace Hub in Meteor Lake-S CPU. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20240429130119.1518073-15-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16cpufreq: amd-pstate: Fix the inconsistency in max frequency unitsDhananjay Ugwekar1-1/+1
commit e4731baaf29438508197d3a8a6d4f5a8c51663f8 upstream. The nominal frequency in cpudata is maintained in MHz whereas all other frequencies are in KHz. This means we have to convert nominal frequency value to KHz before we do any interaction with other frequency values. In amd_pstate_set_boost(), this conversion from MHz to KHz is missed, fix that. Tested on a AMD Zen4 EPYC server Before: $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq 2151 $ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq 400000 $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq 2151 409422 After: $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq 2151000 $ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq 400000 $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq 2151000 1799527 Fixes: ec437d71db77 ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors") Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Acked-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: Peter Jung <ptr1337@cachyos.org> Cc: 5.17+ <stable@vger.kernel.org> # 5.17+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16tpm_tis: Do *not* flush uninitialized workJan Beulich1-1/+2
commit 0ea00e249ca992adee54dc71a526ee70ef109e40 upstream. tpm_tis_core_init() may fail before tpm_tis_probe_irq_single() is called, in which case tpm_tis_remove() unconditionally calling flush_work() is triggering a warning for .func still being NULL. Cc: stable@vger.kernel.org # v6.5+ Fixes: 481c2d14627d ("tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16kmsan: do not wipe out origin when doing partial unpoisoningAlexander Potapenko1-4/+11
commit 2ef3cec44c60ae171b287db7fc2aa341586d65ba upstream. As noticed by Brian, KMSAN should not be zeroing the origin when unpoisoning parts of a four-byte uninitialized value, e.g.: char a[4]; kmsan_unpoison_memory(a, 1); This led to false negatives, as certain poisoned values could receive zero origins, preventing those values from being reported. To fix the problem, check that kmsan_internal_set_shadow_origin() writes zero origins only to slots which have zero shadow. Link: https://lkml.kernel.org/r/20240528104807.738758-1-glider@google.com Fixes: f80be4571b19 ("kmsan: add KMSAN runtime core") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/lkml/20240524232804.1984355-1-bjohannesmeyer@gmail.com/T/ Reviewed-by: Marco Elver <elver@google.com> Tested-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16mm/ksm: fix ksm_zero_pages accountingChengming Zhou4-11/+21
commit c2dc78b86e0821ecf9a9d0c35dba2618279a5bb6 upstream. We normally ksm_zero_pages++ in ksmd when page is merged with zero page, but ksm_zero_pages-- is done from page tables side, where there is no any accessing protection of ksm_zero_pages. So we can read very exceptional value of ksm_zero_pages in rare cases, such as -1, which is very confusing to users. Fix it by changing to use atomic_long_t, and the same case with the mm->ksm_zero_pages. Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-2-34bb358fdc13@linux.dev Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM") Fixes: 6080d19f0704 ("ksm: add ksm zero pages for each process") Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: Stefan Roesch <shr@devkernel.io> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16mm/ksm: fix ksm_pages_scanned accountingChengming Zhou1-4/+2
commit 730cdc2c72c6905a2eda2fccbbf67dcef1206590 upstream. Patch series "mm/ksm: fix some accounting problems", v3. We encountered some abnormal ksm_pages_scanned and ksm_zero_pages during some random tests. 1. ksm_pages_scanned unchanged even ksmd scanning has progress. 2. ksm_zero_pages maybe -1 in some rare cases. This patch (of 2): During testing, I found ksm_pages_scanned is unchanged although the scan_get_next_rmap_item() did return valid rmap_item that is not NULL. The reason is the scan_get_next_rmap_item() will return NULL after a full scan, so ksm_do_scan() just return without accounting of the ksm_pages_scanned. Fix it by just putting ksm_pages_scanned accounting in that loop, and it will be accounted more timely if that loop would last for a long time. Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-0-34bb358fdc13@linux.dev Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-1-34bb358fdc13@linux.dev Fixes: b348b5fe2b5f ("mm/ksm: add pages scanned metric") Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: xu xin <xu.xin16@zte.com.cn> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: Stefan Roesch <shr@devkernel.io> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16net/9p: fix uninit-value in p9_client_rpc()Nikita Zhandarovich1-0/+2
commit 25460d6f39024cc3b8241b14c7ccf0d6f11a736a upstream. Syzbot with the help of KMSAN reported the following error: BUG: KMSAN: uninit-value in trace_9p_client_res include/trace/events/9p.h:146 [inline] BUG: KMSAN: uninit-value in p9_client_rpc+0x1314/0x1340 net/9p/client.c:754 trace_9p_client_res include/trace/events/9p.h:146 [inline] p9_client_rpc+0x1314/0x1340 net/9p/client.c:754 p9_client_create+0x1551/0x1ff0 net/9p/client.c:1031 v9fs_session_init+0x1b9/0x28e0 fs/9p/v9fs.c:410 v9fs_mount+0xe2/0x12b0 fs/9p/vfs_super.c:122 legacy_get_tree+0x114/0x290 fs/fs_context.c:662 vfs_get_tree+0xa7/0x570 fs/super.c:1797 do_new_mount+0x71f/0x15e0 fs/namespace.c:3352 path_mount+0x742/0x1f20 fs/namespace.c:3679 do_mount fs/namespace.c:3692 [inline] __do_sys_mount fs/namespace.c:3898 [inline] __se_sys_mount+0x725/0x810 fs/namespace.c:3875 __x64_sys_mount+0xe4/0x150 fs/namespace.c:3875 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was created at: __alloc_pages+0x9d6/0xe70 mm/page_alloc.c:4598 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] alloc_slab_page mm/slub.c:2175 [inline] allocate_slab mm/slub.c:2338 [inline] new_slab+0x2de/0x1400 mm/slub.c:2391 ___slab_alloc+0x1184/0x33d0 mm/slub.c:3525 __slab_alloc mm/slub.c:3610 [inline] __slab_alloc_node mm/slub.c:3663 [inline] slab_alloc_node mm/slub.c:3835 [inline] kmem_cache_alloc+0x6d3/0xbe0 mm/slub.c:3852 p9_tag_alloc net/9p/client.c:278 [inline] p9_client_prepare_req+0x20a/0x1770 net/9p/client.c:641 p9_client_rpc+0x27e/0x1340 net/9p/client.c:688 p9_client_create+0x1551/0x1ff0 net/9p/client.c:1031 v9fs_session_init+0x1b9/0x28e0 fs/9p/v9fs.c:410 v9fs_mount+0xe2/0x12b0 fs/9p/vfs_super.c:122 legacy_get_tree+0x114/0x290 fs/fs_context.c:662 vfs_get_tree+0xa7/0x570 fs/super.c:1797 do_new_mount+0x71f/0x15e0 fs/namespace.c:3352 path_mount+0x742/0x1f20 fs/namespace.c:3679 do_mount fs/namespace.c:3692 [inline] __do_sys_mount fs/namespace.c:3898 [inline] __se_sys_mount+0x725/0x810 fs/namespace.c:3875 __x64_sys_mount+0xe4/0x150 fs/namespace.c:3875 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 If p9_check_errors() fails early in p9_client_rpc(), req->rc.tag will not be properly initialized. However, trace_9p_client_res() ends up trying to print it out anyway before p9_client_rpc() finishes. Fix this issue by assigning default values to p9_fcall fields such as 'tag' and (just in case KMSAN unearths something new) 'id' during the tag allocation stage. Reported-and-tested-by: syzbot+ff14db38f56329ef68df@syzkaller.appspotmail.com Fixes: 348b59012e5c ("net/9p: Convert net/9p protocol dumps to tracepoints") Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Cc: stable@vger.kernel.org Message-ID: <20240408141039.30428-1-n.zhandarovich@fintech.ru> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16net/ipv6: Fix route deleting failure when metric equals 0xu xin1-1/+4
commit bb487272380d120295e955ad8acfcbb281b57642 upstream. Problem ========= After commit 67f695134703 ("ipv6: Move setting default metric for routes"), we noticed that the logic of assigning the default value of fc_metirc changed in the ioctl process. That is, when users use ioctl(fd, SIOCADDRT, rt) with a non-zero metric to add a route, then they may fail to delete a route with passing in a metric value of 0 to the kernel by ioctl(fd, SIOCDELRT, rt). But iproute can succeed in deleting it. As a reference, when using iproute tools by netlink to delete routes with a metric parameter equals 0, like the command as follows: ip -6 route del fe80::/64 via fe81::5054:ff:fe11:3451 dev eth0 metric 0 the user can still succeed in deleting the route entry with the smallest metric. Root Reason =========== After commit 67f695134703 ("ipv6: Move setting default metric for routes"), When ioctl() pass in SIOCDELRT with a zero metric, rtmsg_to_fib6_config() will set a defalut value (1024) to cfg->fc_metric in kernel, and in ip6_route_del() and the line 4074 at net/ipv3/route.c, it will check by if (cfg->fc_metric && cfg->fc_metric != rt->fib6_metric) continue; and the condition is true and skip the later procedure (deleting route) because cfg->fc_metric != rt->fib6_metric. But before that commit, cfg->fc_metric is still zero there, so the condition is false and it will do the following procedure (deleting). Solution ======== In order to keep a consistent behaviour across netlink() and ioctl(), we should allow to delete a route with a metric value of 0. So we only do the default setting of fc_metric in route adding. CC: stable@vger.kernel.org # 5.4+ Fixes: 67f695134703 ("ipv6: Move setting default metric for routes") Co-developed-by: Fan Yu <fan.yu9@zte.com.cn> Signed-off-by: Fan Yu <fan.yu9@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240514201102055dD2Ba45qKbLlUMxu_DTHP@zte.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16scsi: core: Handle devices which return an unusually large VPD page countMartin K. Petersen1-0/+7
commit d09c05aa35909adb7d29f92f0cd79fdcd1338ef0 upstream. Peter Schneider reported that a system would no longer boot after updating to 6.8.4. Peter bisected the issue and identified commit b5fc07a5fb56 ("scsi: core: Consult supported VPD page list prior to fetching page") as being the culprit. Turns out the enclosure device in Peter's system reports a byteswapped page length for VPD page 0. It reports "02 00" as page length instead of "00 02". This causes us to attempt to access 516 bytes (page length + header) of information despite only 2 pages being present. Limit the page search scope to the size of our VPD buffer to guard against devices returning a larger page count than requested. Link: https://lore.kernel.org/r/20240521023040.2703884-1-martin.petersen@oracle.com Fixes: b5fc07a5fb56 ("scsi: core: Consult supported VPD page list prior to fetching page") Cc: stable@vger.kernel.org Reported-by: Peter Schneider <pschneider1968@googlemail.com> Closes: https://lore.kernel.org/all/eec6ebbf-061b-4a7b-96dc-ea748aa4d035@googlemail.com/ Tested-by: Peter Schneider <pschneider1968@googlemail.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16HID: i2c-hid: elan: fix reset suspend current leakageJohan Hovold1-12/+47
commit 0eafc58f2194dbd01d4be40f99a697681171995b upstream. The Elan eKTH5015M touch controller found on the Lenovo ThinkPad X13s shares the VCC33 supply with other peripherals that may remain powered during suspend (e.g. when enabled as wakeup sources). The reset line is also wired so that it can be left deasserted when the supply is off. This is important as it avoids holding the controller in reset for extended periods of time when it remains powered, which can lead to increased power consumption, and also avoids leaking current through the X13s reset circuitry during suspend (and after driver unbind). Use the new 'no-reset-on-power-off' devicetree property to determine when reset needs to be asserted on power down. Notably this also avoids wasting power on machine variants without a touchscreen for which the driver would otherwise exit probe with reset asserted. Fixes: bd3cba00dcc6 ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens") Cc: <stable@vger.kernel.org> # 6.0 Cc: Douglas Anderson <dianders@chromium.org> Tested-by: Steev Klimaszewski <steev@kali.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240507144821.12275-5-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16i2c: acpi: Unbind mux adapters before deleteHamish Martin1-4/+15
commit 3f858bbf04dbac934ac279aaee05d49eb9910051 upstream. There is an issue with ACPI overlay table removal specifically related to I2C multiplexers. Consider an ACPI SSDT Overlay that defines a PCA9548 I2C mux on an existing I2C bus. When this table is loaded we see the creation of a device for the overall PCA9548 chip and 8 further devices - one i2c_adapter each for the mux channels. These are all bound to their ACPI equivalents via an eventual invocation of acpi_bind_one(). When we unload the SSDT overlay we run into the problem. The ACPI devices are deleted as normal via acpi_device_del_work_fn() and the acpi_device_del_list. However, the following warning and stack trace is output as the deletion does not go smoothly: ------------[ cut here ]------------ kernfs: can not remove 'physical_node', no directory WARNING: CPU: 1 PID: 11 at fs/kernfs/dir.c:1674 kernfs_remove_by_name_ns+0xb9/0xc0 Modules linked in: CPU: 1 PID: 11 Comm: kworker/u128:0 Not tainted 6.8.0-rc6+ #1 Hardware name: congatec AG conga-B7E3/conga-B7E3, BIOS 5.13 05/16/2023 Workqueue: kacpi_hotplug acpi_device_del_work_fn RIP: 0010:kernfs_remove_by_name_ns+0xb9/0xc0 Code: e4 00 48 89 ef e8 07 71 db ff 5b b8 fe ff ff ff 5d 41 5c 41 5d e9 a7 55 e4 00 0f 0b eb a6 48 c7 c7 f0 38 0d 9d e8 97 0a d5 ff <0f> 0b eb dc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffff9f864008fb28 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8ef90a8d4940 RCX: 0000000000000000 RDX: ffff8f000e267d10 RSI: ffff8f000e25c780 RDI: ffff8f000e25c780 RBP: ffff8ef9186f9870 R08: 0000000000013ffb R09: 00000000ffffbfff R10: 00000000ffffbfff R11: ffff8f000e0a0000 R12: ffff9f864008fb50 R13: ffff8ef90c93dd60 R14: ffff8ef9010d0958 R15: ffff8ef9186f98c8 FS: 0000000000000000(0000) GS:ffff8f000e240000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f48f5253a08 CR3: 00000003cb82e000 CR4: 00000000003506f0 Call Trace: <TASK> ? kernfs_remove_by_name_ns+0xb9/0xc0 ? __warn+0x7c/0x130 ? kernfs_remove_by_name_ns+0xb9/0xc0 ? report_bug+0x171/0x1a0 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? kernfs_remove_by_name_ns+0xb9/0xc0 ? kernfs_remove_by_name_ns+0xb9/0xc0 acpi_unbind_one+0x108/0x180 device_del+0x18b/0x490 ? srso_return_thunk+0x5/0x5f ? srso_return_thunk+0x5/0x5f device_unregister+0xd/0x30 i2c_del_adapter.part.0+0x1bf/0x250 i2c_mux_del_adapters+0xa1/0xe0 i2c_device_remove+0x1e/0x80 device_release_driver_internal+0x19a/0x200 bus_remove_device+0xbf/0x100 device_del+0x157/0x490 ? __pfx_device_match_fwnode+0x10/0x10 ? srso_return_thunk+0x5/0x5f device_unregister+0xd/0x30 i2c_acpi_notify+0x10f/0x140 notifier_call_chain+0x58/0xd0 blocking_notifier_call_chain+0x3a/0x60 acpi_device_del_work_fn+0x85/0x1d0 process_one_work+0x134/0x2f0 worker_thread+0x2f0/0x410 ? __pfx_worker_thread+0x10/0x10 kthread+0xe3/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2f/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> ---[ end trace 0000000000000000 ]--- ... repeated 7 more times, 1 for each channel of the mux ... The issue is that the binding of the ACPI devices to their peer I2C adapters is not correctly cleaned up. Digging deeper into the issue we see that the deletion order is such that the ACPI devices matching the mux channel i2c adapters are deleted first during the SSDT overlay removal. For each of the channels we see a call to i2c_acpi_notify() with ACPI_RECONFIG_DEVICE_REMOVE but, because these devices are not actually i2c_clients, nothing is done for them. Later on, after each of the mux channels has been dealt with, we come to delete the i2c_client representing the PCA9548 device. This is the call stack we see above, whereby the kernel cleans up the i2c_client including destruction of the mux and its channel adapters. At this point we do attempt to unbind from the ACPI peers but those peers no longer exist and so we hit the kernfs errors. The fix is to augment i2c_acpi_notify() to handle i2c_adapters. But, given that the life cycle of the adapters is linked to the i2c_client, instead of deleting the i2c_adapters during the i2c_acpi_notify(), we just trigger unbinding of the ACPI device from the adapter device, and allow the clean up of the adapter to continue in the way it always has. Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>