summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2024-05-30Bluetooth: compute LE flow credits based on recvbuf spaceSebastian Urban1-1/+10
[ Upstream commit ce60b9231b66710b6ee24042ded26efee120ecfc ] Previously LE flow credits were returned to the sender even if the socket's receive buffer was full. This meant that no back-pressure was applied to the sender, thus it continued to send data, resulting in data loss without any error being reported. Furthermore, the amount of credits was essentially fixed to a small amount, leading to reduced performance. This is fixed by computing the number of returned LE flow credits based on the estimated available space in the receive buffer of an L2CAP socket. Consequently, if the receive buffer is full, no credits are returned until the buffer is read and thus cleared by user-space. Since the computation of available receive buffer space can only be performed approximately (due to sk_buff overhead) and the receive buffer size may be changed by user-space after flow credits have been sent, superfluous received data is temporary stored within l2cap_pinfo. This is necessary because Bluetooth LE provides no retransmission mechanism once the data has been acked by the physical layer. If receive buffer space estimation is not possible at the moment, we fall back to providing credits for one full packet as before. This is currently the case during connection setup, when MPS is not yet available. Fixes: b1c325c23d75 ("Bluetooth: Implement returning of LE L2CAP credits") Signed-off-by: Sebastian Urban <surban@surban.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30net: stmmac: move the EST lock to struct stmmac_privXiaolei Wang1-1/+0
[ Upstream commit 36ac9e7f2e5786bd37c5cd91132e1f39c29b8197 ] Reinitialize the whole EST structure would also reset the mutex lock which is embedded in the EST structure, and then trigger the following warning. To address this, move the lock to struct stmmac_priv. We also need to reacquire the mutex lock when doing this initialization. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 505 at kernel/locking/mutex.c:587 __mutex_lock+0xd84/0x1068 Modules linked in: CPU: 3 PID: 505 Comm: tc Not tainted 6.9.0-rc6-00053-g0106679839f7-dirty #29 Hardware name: NXP i.MX8MPlus EVK board (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mutex_lock+0xd84/0x1068 lr : __mutex_lock+0xd84/0x1068 sp : ffffffc0864e3570 x29: ffffffc0864e3570 x28: ffffffc0817bdc78 x27: 0000000000000003 x26: ffffff80c54f1808 x25: ffffff80c9164080 x24: ffffffc080d723ac x23: 0000000000000000 x22: 0000000000000002 x21: 0000000000000000 x20: 0000000000000000 x19: ffffffc083bc3000 x18: ffffffffffffffff x17: ffffffc08117b080 x16: 0000000000000002 x15: ffffff80d2d40000 x14: 00000000000002da x13: ffffff80d2d404b8 x12: ffffffc082b5a5c8 x11: ffffffc082bca680 x10: ffffffc082bb2640 x9 : ffffffc082bb2698 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001 x5 : ffffff8178fe0d48 x4 : 0000000000000000 x3 : 0000000000000027 x2 : ffffff8178fe0d50 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: __mutex_lock+0xd84/0x1068 mutex_lock_nested+0x28/0x34 tc_setup_taprio+0x118/0x68c stmmac_setup_tc+0x50/0xf0 taprio_change+0x868/0xc9c Fixes: b2aae654a479 ("net: stmmac: add mutex lock to protect est parameters") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240513014346.1718740-2-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30net: stmmac: Offload queueMaxSDU from tc-taprioRohan G Thomas1-0/+1
[ Upstream commit c5c3e1bfc9e0ee72af528df8d773980f4855938a ] Add support for configuring queueMaxSDU. As DWMAC IPs doesn't support queueMaxSDU table handle this in the SW. The maximum 802.3 frame size that is allowed to be transmitted by any queue is queueMaxSDU + 16 bytes (i.e. 6 bytes SA + 6 bytes DA + 4 bytes FCS). Inspired from intel i225 driver. Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 36ac9e7f2e57 ("net: stmmac: move the EST lock to struct stmmac_priv") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30ax25: Use kernel universal linked list to implement ax25_dev_listDuoming Zhou1-2/+1
[ Upstream commit a7d6e36b9ad052926ba2ecba3a59d8bb67dabcb4 ] The origin ax25_dev_list implements its own single linked list, which is complicated and error-prone. For example, when deleting the node of ax25_dev_list in ax25_dev_device_down(), we have to operate on the head node and other nodes separately. This patch uses kernel universal linked list to replace original ax25_dev_list, which make the operation of ax25_dev_list easier. We should do "dev->ax25_ptr = ax25_dev;" and "dev->ax25_ptr = NULL;" while holding the spinlock, otherwise the ax25_dev_device_up() and ax25_dev_device_down() could race. Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/85bba3af651ca0e1a519da8d0d715b949891171c.1715247018.git.duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: b505e0319852 ("ax25: Fix reference count leak issues of ax25_dev") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30net/mlx5: Add a timeout to acquire the command queue semaphoreAkiva Goldberger1-0/+1
[ Upstream commit 485d65e1357123a697c591a5aeb773994b247ad7 ] Prevent forced completion handling on an entry that has not yet been assigned an index, causing an out of bounds access on idx = -22. Instead of waiting indefinitely for the sem, blocking flow now waits for index to be allocated or a sem acquisition timeout before beginning the timer for FW completion. Kernel log example: mlx5_core 0000:06:00.0: wait_func_handle_exec_timeout:1128:(pid 185911): cmd[-22]: CREATE_UCTX(0xa04) No done completion Fixes: 8e715cd613a1 ("net/mlx5: Set command entry semaphore up once got index free") Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240509112951.590184-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30x86/numa: Fix SRAT lookup of CFMWS ranges with numa_fill_memblks()Robert Richter1-6/+1
[ Upstream commit f9f67e5adc8dc2e1cc51ab2d3d6382fa97f074d4 ] For configurations that have the kconfig option NUMA_KEEP_MEMINFO disabled, numa_fill_memblks() only returns with NUMA_NO_MEMBLK (-1). SRAT lookup fails then because an existing SRAT memory range cannot be found for a CFMWS address range. This causes the addition of a duplicate numa_memblk with a different node id and a subsequent page fault and kernel crash during boot. Fix this by making numa_fill_memblks() always available regardless of NUMA_KEEP_MEMINFO. As Dan suggested, the fix is implemented to remove numa_fill_memblks() from sparsemem.h and alos using __weak for the function. Note that the issue was initially introduced with [1]. But since phys_to_target_node() was originally used that returned the valid node 0, an additional numa_memblk was not added. Though, the node id was wrong too, a message is seen then in the logs: kernel/numa.c: pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n", [1] commit fd49f99c1809 ("ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT") Suggested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/all/66271b0072317_69102944c@dwillia2-xfh.jf.intel.com.notmuch/ Fixes: 8f1004679987 ("ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30pwm: Provide an inline function to get the parent device of a given chipUwe Kleine-König1-0/+5
[ Upstream commit 4e59267c7a20eb1c1ad9106e682cb6a0d8eb3111 ] Currently a pwm_chip stores in its struct device *dev member a pointer to the parent device. Preparing a change that embeds a full struct device in struct pwm_chip, this accessor function should be used in all drivers directly accessing chip->dev now. This way struct pwm_chip and this new function can be changed without having to touch all drivers in the same change set. Make use of this function in the framework's core sources. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/cc30090d2f9762bed9854a55612144bccc910781.1707900770.git.u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Stable-dep-of: 3e551115aee0 ("pwm: meson: Add check for error from clk_round_rate()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30pwm: Drop useless member .of_pwm_n_cells of struct pwm_chipUwe Kleine-König1-2/+0
[ Upstream commit 4e77431cda4973f03d063c47f6ea313dfceebf16 ] Apart from the two of_xlate implementations this member is write-only. In the of_xlate functions of_pwm_xlate_with_flags() and of_pwm_single_xlate() it's more sensible to check for args->args_count because this is what is actually used in the device tree. Acked-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/53d8c545aa8f79a920358be9e72e382b3981bdc4.1704835845.git.u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Stable-dep-of: 3e551115aee0 ("pwm: meson: Add check for error from clk_round_rate()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30tcp: increase the default TCP scaling ratioHechao Li1-3/+2
[ Upstream commit 697a6c8cec03c2299f850fa50322641a8bf6b915 ] After commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), we noticed an application-level timeout due to reduced throughput. Before the commit, for a client that sets SO_RCVBUF to 65k, it takes around 22 seconds to transfer 10M data. After the commit, it takes 40 seconds. Because our application has a 30-second timeout, this regression broke the application. The reason that it takes longer to transfer data is that tp->scaling_ratio is initialized to a value that results in ~0.25 of rcvbuf. In our case, SO_RCVBUF is set to 65536 by the application, which translates to 2 * 65536 = 131,072 bytes in rcvbuf and hence a ~28k initial receive window. Later, even though the scaling_ratio is updated to a more accurate skb->len/skb->truesize, which is ~0.66 in our environment, the window stays at ~0.25 * rcvbuf. This is because tp->window_clamp does not change together with the tp->scaling_ratio update when autotuning is disabled due to SO_RCVBUF. As a result, the window size is capped at the initial window_clamp, which is also ~0.25 * rcvbuf, and never grows bigger. Most modern applications let the kernel do autotuning, and benefit from the increased scaling_ratio. But there are applications such as kafka that has a default setting of SO_RCVBUF=64k. This patch increases the initial scaling_ratio from ~25% to 50% in order to make it backward compatible with the original default sysctl_tcp_adv_win_scale for applications setting SO_RCVBUF. Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale") Signed-off-by: Hechao Li <hli@netflix.com> Reviewed-by: Tycho Andersen <tycho@tycho.pizza> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/netdev/20240402215405.432863-1-hli@netflix.com/ Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30bpf: Pack struct bpf_fib_lookupAnton Protopopov1-1/+1
[ Upstream commit f91717007217d975aa975ddabd91ae1a107b9bff ] The struct bpf_fib_lookup is supposed to be of size 64. A recent commit 59b418c7063d ("bpf: Add a check for struct bpf_fib_lookup size") added a static assertion to check this property so that future changes to the structure will not accidentally break this assumption. As it immediately turned out, on some 32-bit arm systems, when AEABI=n, the total size of the structure was equal to 68, see [1]. This happened because the bpf_fib_lookup structure contains a union of two 16-bit fields: union { __u16 tot_len; __u16 mtu_result; }; which was supposed to compile to a 16-bit-aligned 16-bit field. On the aforementioned setups it was instead both aligned and padded to 32-bits. Declare this inner union as __attribute__((packed, aligned(2))) such that it always is of size 2 and is aligned to 16 bits. [1] https://lore.kernel.org/all/CA+G9fYtsoP51f-oP_Sp5MOq-Ffv8La2RztNpwvE6+R1VtFiLrw@mail.gmail.com/#t Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: e1850ea9bd9e ("bpf: bpf_fib_lookup return MTU value as output when looked up") Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240403123303.1452184-1-aspsk@isovalent.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30bitops: add missing prototype checkAlexander Lobakin1-0/+1
[ Upstream commit 72cc1980a0ef3ccad0d539e7dace63d0d7d432a4 ] Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added a new bitop, test_bit_acquire(), with proper wrapping in order to try to optimize it at compile-time, but missed the list of bitops used for checking their prototypes a bit below. The functions added have consistent prototypes, so that no more changes are required and no functional changes take place. Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30ACPI: bus: Indicate support for IRQ ResourceSource thru _OSCArmin Wolf1-0/+1
[ Upstream commit 403ad17c06509794fdf6e4d4b3070bd5b56e2a8e ] The ACPI IRQ mapping code supports parsing of ResourceSource, but this is not reported thru _OSC. Fix this by setting bit 13 ("Interrupt ResourceSource support") when evaluating _OSC. Fixes: d44fa3d46079 ("ACPI: Add support for ResourceSource/IRQ domain mapping") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30ACPI: Fix Generic Initiator Affinity _OSC bitArmin Wolf1-1/+1
[ Upstream commit d0d4f1474e36b195eaad477373127ae621334c01 ] The ACPI spec says bit 17 should be used to indicate support for Generic Initiator Affinity Structure in SRAT, but we currently set bit 13 ("Interrupt ResourceSource support"). Fix this by actually setting bit 17 when evaluating _OSC. Fixes: 01aabca2fd54 ("ACPI: Let ACPI know we support Generic Initiator Affinity Structures") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30ACPI: bus: Indicate support for the Generic Event Device thru _OSCArmin Wolf1-0/+1
[ Upstream commit a8a967a243d71dd635ede076020f665a4df51c63 ] A device driver for the Generic Event Device (ACPI0013) already exists for quite some time, but support for it was never reported thru _OSC. Fix this by setting bit 11 ("Generic Event Device support") when evaluating _OSC. Fixes: 3db80c230da1 ("ACPI: implement Generic Event Device") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30ACPI: bus: Indicate support for more than 16 p-states thru _OSCArmin Wolf1-0/+1
[ Upstream commit 6e8345f23ca37d6d41bb76be5d6a705ddf542817 ] The code responsible for parsing the available p-states should have no problems handling more than 16 p-states. Indicate this by setting bit 10 ("Greater Than 16 p-state support") when evaluating _OSC. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Stable-dep-of: a8a967a243d7 ("ACPI: bus: Indicate support for the Generic Event Device thru _OSC") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30ACPI: bus: Indicate support for _TFP thru _OSCArmin Wolf1-0/+1
[ Upstream commit 95d43290f1e476b3be782dd17642e452d0436266 ] The ACPI thermal driver already uses the _TPF ACPI method to retrieve precise sampling time values, but this is not reported thru _OSC. Fix this by setting bit 9 ("Fast Thermal Sampling support") when evaluating _OSC. Fixes: a2ee7581afd5 ("ACPI: thermal: Add Thermal fast Sampling Period (_TFP) support") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30wifi: ieee80211: fix ieee80211_mle_basic_sta_prof_size_ok()Johannes Berg1-1/+1
[ Upstream commit c121514df0daa800cc500dc2738e0b8a1c54af98 ] If there was a possibility of an MLE basic STA profile without subelements, we might reject it because we account for the one octet for sta_info_len twice (it's part of itself, and in the fixed portion). Like in ieee80211_mle_reconf_sta_prof_size_ok, subtract 1 to adjust that. When reading the elements we did take this into account, and since there are always elements, this never really mattered. Fixes: 7b6f08771bf6 ("wifi: ieee80211: Support validating ML station profile length") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240318184907.00bb0b20ed60.I8c41dd6fc14c4b187ab901dea15ade73c79fb98c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30libfs: Add simple_offset_rename() APIChuck Lever1-0/+2
[ Upstream commit 5a1a25be995e1014abd01600479915683e356f5c ] I'm about to fix a tmpfs rename bug that requires the use of internal simple_offset helpers that are not available in mm/shmem.c Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20240415152057.4605-3-cel@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Stable-dep-of: ad191eb6d694 ("shmem: Fix shmem_rename2()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30libfs: Convert simple directory offsets to use a Maple TreeChuck Lever1-2/+3
[ Upstream commit 0e4a862174f2a8d1653a8a9cf0815020e1d3af24 ] Test robot reports: > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on: > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets") > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master Feng Tang further clarifies that: > ... the new simple_offset_add() > called by shmem_mknod() brings extra cost related with slab, > specifically the 'radix_tree_node', which cause the regression. Willy's analysis is that, over time, the test workload causes xa_alloc_cyclic() to fragment the underlying SLAB cache. This patch replaces the offset_ctx's xarray with a Maple Tree in the hope that Maple Tree's dense node mode will handle this scenario more scalably. In addition, we can widen the simple directory offset maximum to signed long (as loff_t is also signed). Suggested-by: Matthew Wilcox <willy@infradead.org> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/170820145616.6328.12620992971699079156.stgit@91.116.238.104.host.secureserver.net Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> Stable-dep-of: 23cdd0eed3f1 ("libfs: Fix simple_offset_rename_exchange()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30maple_tree: Add mtree_alloc_cyclic()Chuck Lever1-0/+7
[ Upstream commit 9b6713cc75229f25552c643083cbdbfb771e5bca ] I need a cyclic allocator for the simple_offset implementation in fs/libfs.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/170820144179.6328.12838600511394432325.stgit@91.116.238.104.host.secureserver.net Signed-off-by: Christian Brauner <brauner@kernel.org> Stable-dep-of: 23cdd0eed3f1 ("libfs: Fix simple_offset_rename_exchange()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30libfs: Add simple_offset_empty()Chuck Lever1-0/+1
[ Upstream commit ecba88a3b32d733d41e27973e25b2bc580f64281 ] For simple filesystems that use directory offset mapping, rely strictly on the directory offset map to tell when a directory has no children. After this patch is applied, the emptiness test holds only the RCU read lock when the directory being tested has no children. In addition, this adds another layer of confirmation that simple_offset_add/remove() are working as expected. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/170820143463.6328.7872919188371286951.stgit@91.116.238.104.host.secureserver.net Signed-off-by: Christian Brauner <brauner@kernel.org> Stable-dep-of: 23cdd0eed3f1 ("libfs: Fix simple_offset_rename_exchange()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=nSean Christopherson1-0/+11
[ Upstream commit ce0abef6a1d540acef85068e0e82bdf1fbeeb0e9 ] Explicitly disallow enabling mitigations at runtime for kernels that were built with CONFIG_CPU_MITIGATIONS=n, as some architectures may omit code entirely if mitigations are disabled at compile time. E.g. on x86, a large pile of Kconfigs are buried behind CPU_MITIGATIONS, and trying to provide sane behavior for retroactively enabling mitigations is extremely difficult, bordering on impossible. E.g. page table isolation and call depth tracking require build-time support, BHI mitigations will still be off without additional kernel parameters, etc. [ bp: Touchups. ] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240420000556.2645001-3-seanjc@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30wifi: mac80211: don't use rate mask for scanningJohannes Berg1-0/+3
[ Upstream commit ab9177d83c040eba58387914077ebca56f14fae6 ] The rate mask is intended for use during operation, and can be set to only have masks for the currently active band. As such, it cannot be used for scanning which can be on other bands as well. Simply ignore the rate masks during scanning to avoid warnings from incorrect settings. Reported-by: syzbot+fdc5123366fb9c3fdc6d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fdc5123366fb9c3fdc6d Co-developed-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Tested-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://msgid.link/20240326220854.9594cbb418ca.I7f86c0ba1f98cf7e27c2bacf6c2d417200ecea5c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-25block: add a disk_has_partscan helperChristoph Hellwig1-0/+13
commit 140ce28dd3bee8e53acc27f123ae474d69ef66f0 upstream. Add a helper to check if partition scanning is enabled instead of open coding the check in a few places. This now always checks for the hidden flag even if all but one of the callers are never reachable for hidden gendisks. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240502130033.1958492-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()Sungwoo Kim2-0/+10
commit a5b862c6a221459d54e494e88965b48dcfa6cc44 upstream. l2cap_le_flowctl_init() can cause both div-by-zero and an integer overflow since hdev->le_mtu may not fall in the valid range. Move MTU from hci_dev to hci_conn to validate MTU and stop the connection process earlier if MTU is invalid. Also, add a missing validation in read_buffer_size() and make it return an error value if the validation fails. Now hci_conn_add() returns ERR_PTR() as it can fail due to the both a kzalloc failure and invalid MTU value. divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G W 6.9.0-rc5+ #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci0 hci_rx_work RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547 Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c 89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42 RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246 RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084 R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000 FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline] l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline] l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline] l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809 l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506 hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline] hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335 worker_thread+0x926/0xe70 kernel/workqueue.c:3416 kthread+0x2e3/0x380 kernel/kthread.c:388 ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- Fixes: 6ed58ec520ad ("Bluetooth: Use LE buffers for LE traffic") Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17VFIO: Add the SPR_DSA and SPR_IAX devices to the denylistArjan van de Ven1-0/+2
commit 95feb3160eef0caa6018e175a5560b816aee8e79 upstream. Due to an erratum with the SPR_DSA and SPR_IAX devices, it is not secure to assign these devices to virtual machines. Add the PCI IDs of these devices to the VFIO denylist to ensure that this is handled appropriately by the VFIO subsystem. The SPR_DSA and SPR_IAX devices are on-SOC devices for the Sapphire Rapids (and related) family of products that perform data movement and compression. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17kmsan: compiler_types: declare __no_sanitize_or_inlineAlexander Potapenko1-0/+11
commit 90d1f14cbb9ddbfc532e2da13bf6e0ed8320e792 upstream. It turned out that KMSAN instruments READ_ONCE_NOCHECK(), resulting in false positive reports, because __no_sanitize_or_inline enforced inlining. Properly declare __no_sanitize_or_inline under __SANITIZE_MEMORY__, so that it does not __always_inline the annotated function. Link: https://lkml.kernel.org/r/20240426091622.3846771-1-glider@google.com Fixes: 5de0ce85f5a4 ("kmsan: mark noinstr as __no_sanitize_memory") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: syzbot+355c5bb8c1445c871ee8@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/000000000000826ac1061675b0e3@google.com Cc: <stable@vger.kernel.org> Reviewed-by: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17mm/slab: make __free(kfree) accept error pointersDan Carpenter1-2/+2
commit cd7eb8f83fcf258f71e293f7fc52a70be8ed0128 upstream. Currently, if an automatically freed allocation is an error pointer that will lead to a crash. An example of this is in wm831x_gpio_dbg_show(). 171 char *label __free(kfree) = gpiochip_dup_line_label(chip, i); 172 if (IS_ERR(label)) { 173 dev_err(wm831x->dev, "Failed to duplicate label\n"); 174 continue; 175 } The auto clean up function should check for error pointers as well, otherwise we're going to keep hitting issues like this. Fixes: 54da6a092431 ("locking: Introduce __cleanup() based infrastructure") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17Reapply "drm/qxl: simplify qxl_fence_wait"Linus Torvalds1-7/+0
commit 3628e0383dd349f02f882e612ab6184e4bb3dc10 upstream. This reverts commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea. Stephen Rostedt reports: "I went to run my tests on my VMs and the tests hung on boot up. Unfortunately, the most I ever got out was: [ 93.607888] Testing event system initcall: OK [ 93.667730] Running tests on all trace events: [ 93.669757] Testing all events: OK [ 95.631064] ------------[ cut here ]------------ Timed out after 60 seconds" and further debugging points to a possible circular locking dependency between the console_owner locking and the worker pool locking. Reverting the commit allows Steve's VM to boot to completion again. [ This may obviously result in the "[TTM] Buffer eviction failed" messages again, which was the reason for that original revert. But at this point this seems preferable to a non-booting system... ] Reported-and-bisected-by: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/all/20240502081641.457aa25f@gandalf.local.home/ Acked-by: Maxime Ripard <mripard@kernel.org> Cc: Alex Constantino <dreaming.about.electric.sheep@gmail.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Timo Lindfors <timo.lindfors@iki.fi> Cc: Dave Airlie <airlied@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17rxrpc: Fix the names of the fields in the ACK trailer structDavid Howells1-1/+1
[ Upstream commit 17469ae0582aaacad36e8e858f58b86c369f21ef ] From AFS-3.3 a trailer containing extra info was added to the ACK packet format - but AF_RXRPC has the names of some of the fields mixed up compared to other AFS implementations. Rename the struct and the fields to make them match. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Stable-dep-of: ba4e103848d3 ("rxrpc: Fix congestion control algorithm") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17xfrm: Preserve vlan tags for transport mode software GROPaul Davey2-0/+18
[ Upstream commit 58fbfecab965014b6e3cc956a76b4a96265a1add ] The software GRO path for esp transport mode uses skb_mac_header_rebuild prior to re-injecting the packet via the xfrm_napi_dev. This only copies skb->mac_len bytes of header which may not be sufficient if the packet contains 802.1Q tags or other VLAN tags. Worse copying only the initial header will leave a packet marked as being VLAN tagged but without the corresponding tag leading to mangling when it is later untagged. The VLAN tags are important when receiving the decrypted esp transport mode packet after GRO processing to ensure it is received on the correct interface. Therefore record the full mac header length in xfrm*_transport_input for later use in corresponding xfrm*_transport_finish to copy the entire mac header when rebuilding the mac header for GRO. The skb->data pointer is left pointing skb->mac_header bytes after the start of the mac header as is expected by the network stack and network and transport header offsets reset to this location. Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17Drivers: hv: vmbus: Track decrypted status in vmbus_gpadlRick Edgecombe1-0/+1
[ Upstream commit 211f514ebf1ef5de37b1cf6df9d28a56cfd242ca ] In CoCo VMs it is possible for the untrusted host to cause set_memory_encrypted() or set_memory_decrypted() to fail such that an error is returned and the resulting memory is shared. Callers need to take care to handle these errors to avoid returning decrypted (shared) memory to the page allocator, which could lead to functional or security issues. In order to make sure callers of vmbus_establish_gpadl() and vmbus_teardown_gpadl() don't return decrypted/shared pages to allocators, add a field in struct vmbus_gpadl to keep track of the decryption status of the buffers. This will allow the callers to know if they should free or leak the pages. Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20240311161558.1310-3-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240311161558.1310-3-mhklinux@outlook.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17net: add copy_safe_from_sockptr() helperEric Dumazet1-0/+25
[ Upstream commit 6309863b31dd80317cd7d6824820b44e254e2a9c ] copy_from_sockptr() helper is unsafe, unless callers did the prior check against user provided optlen. Too many callers get this wrong, lets add a helper to fix them and avoid future copy/paste bugs. Instead of : if (optlen < sizeof(opt)) { err = -EINVAL; break; } if (copy_from_sockptr(&opt, optval, sizeof(opt)) { err = -EFAULT; break; } Use : err = copy_safe_from_sockptr(&opt, sizeof(opt), optval, optlen); if (err) break; Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240408082845.3957374-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17memblock tests: fix undefined reference to `BIT'Wei Yang1-0/+2
[ Upstream commit 592447f6cb3c20d606d6c5d8e6af68e99707b786 ] commit 772dd0342727 ("mm: enumerate all gfp flags") define gfp flags with the help of BIT, while gfp_types.h doesn't include header file for the definition. This through an error on building memblock tests. Let's include linux/bits.h to fix it. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Suren Baghdasaryan <surenb@google.com> CC: Michal Hocko <mhocko@suse.com> Link: https://lore.kernel.org/r/20240402132701.29744-4-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17drm/amdkfd: range check cp bad op exception interruptsJonathan Kim1-3/+14
[ Upstream commit 0cac183b98d8a8c692c98e8dba37df15a9e9210d ] Due to a CP interrupt bug, bad packet garbage exception codes are raised. Do a range check so that the debugger and runtime do not receive garbage codes. Update the user api to guard exception code type checking as well. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Tested-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17scsi: mpi3mr: Avoid memcpy field-spanning write WARNINGShin'ichiro Kawasaki1-1/+1
[ Upstream commit 429846b4b6ce9853e0d803a2357bb2e55083adf0 ] When the "storcli2 show" command is executed for eHBA-9600, mpi3mr driver prints this WARNING message: memcpy: detected field-spanning write (size 128) of single field "bsg_reply_buf->reply_buf" at drivers/scsi/mpi3mr/mpi3mr_app.c:1658 (size 1) WARNING: CPU: 0 PID: 12760 at drivers/scsi/mpi3mr/mpi3mr_app.c:1658 mpi3mr_bsg_request+0x6b12/0x7f10 [mpi3mr] The cause of the WARN is 128 bytes memcpy to the 1 byte size array "__u8 replay_buf[1]" in the struct mpi3mr_bsg_in_reply_buf. The array is intended to be a flexible length array, so the WARN is a false positive. To suppress the WARN, remove the constant number '1' from the array declaration and clarify that it has flexible length. Also, adjust the memory allocation size to match the change. Suggested-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20240323084155.166835-1-shinichiro.kawasaki@wdc.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17net: gro: fix udp bad offset in socket lookup by adding ↵Richard Gobert1-0/+9
{inner_}network_offset to napi_gro_cb [ Upstream commit 5ef31ea5d053a8f493a772ebad3f3ce82c35d845 ] Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp: additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the complete phase of gro. The functions always return skb->network_header, which in the case of encapsulated packets at the gro complete phase, is always set to the innermost L3 of the packet. That means that calling {ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_ L3/L4 may return an unexpected value. This incorrect usage leads to a bug in GRO's UDP socket lookup. udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These *_hdr functions return network_header which will point to the innermost L3, resulting in the wrong offset being used in __udp{4,6}_lib_lookup with encapsulated packets. This patch adds network_offset and inner_network_offset to napi_gro_cb, and makes sure both are set correctly. To fix the issue, network_offsets union is used inside napi_gro_cb, in which both the outer and the inner network offsets are saved. Reproduction example: Endpoint configuration example (fou + local address bind) # ip fou add port 6666 ipproto 4 # ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip # ip link set tun1 up # ip a add 1.1.1.2/24 dev tun1 Netperf TCP_STREAM result on net-next before patch is applied: net-next main, GRO enabled: $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 5.28 2.37 net-next main, GRO disabled: $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 5.01 2745.06 patch applied, GRO enabled: $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 5.01 2877.38 Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket") Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17ALSA: emu10k1: move the whole GPIO event handling to the workqueueOswald Buddenhagen1-2/+1
[ Upstream commit f848337cd801c7106a4ec0d61765771dab2a5909 ] The actual event processing was already done by workqueue items. We can move the event dispatching there as well, rather than doing it already in the interrupt handler callback. This change has a rather profound "side effect" on the reliability of the FPGA programming: once we enter programming mode, we must not issue any snd_emu1010_fpga_{read,write}() calls until we're done, as these would badly mess up the programming protocol. But exactly that would happen when trying to program the dock, as that triggers GPIO interrupts as a side effect. This is mitigated by deferring the actual interrupt handling, as workqueue items are not re-entrant. To avoid scheduling the dispatcher on non-events, we now explicitly ignore GPIO IRQs triggered by "uninteresting" pins, which happens a lot as a side effect of calling snd_emu1010_fpga_{read,write}(). Fixes: fbb64eedf5a3 ("ALSA: emu10k1: make E-MU dock monitoring interrupt-driven") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218584 Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Message-ID: <20240428093716.3198666-4-oswald.buddenhagen@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17regmap: Add regmap_read_bypassed()Richard Fitzgerald1-0/+8
[ Upstream commit 70ee853eec5693fefd8348a2b049d9cb83362e58 ] Add a regmap_read_bypassed() to allow reads from the hardware registers while the regmap is in cache-only mode. A typical use for this is to keep the cache in cache-only mode until the hardware has reached a valid state, but one or more status registers must be polled to determine when this state is reached. For example, firmware download on the cs35l56 can take several seconds if there are multiple amps sharing limited bus bandwidth. This is too long to block in probe() so it is done as a background task. The device must be soft-reset to reboot the firmware and during this time the registers are not accessible, so the cache should be in cache-only. But the driver must poll a register to detect when reboot has completed. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: 8a731fd37f8b ("ASoC: cs35l56: Move utility functions to shared file") Link: https://msgid.link/r/20240408101803.43183-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueueJason Xing1-0/+2
[ Upstream commit 6648e613226e18897231ab5e42ffc29e63fa3365 ] Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which syzbot reported [1]. [1] BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 sk_psock_put include/linux/skmsg.h:459 [inline] sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 unix_release+0x4b/0x80 net/unix/af_unix.c:1048 __sock_release net/socket.c:659 [inline] sock_close+0x68/0x150 net/socket.c:1421 __fput+0x2c1/0x660 fs/file_table.c:422 __fput_sync+0x44/0x60 fs/file_table.c:507 __do_sys_close fs/open.c:1556 [inline] __se_sys_close+0x101/0x1b0 fs/open.c:1541 __x64_sys_close+0x1f/0x30 fs/open.c:1541 do_syscall_64+0xd3/0x1d0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: sk_psock_data_ready include/linux/skmsg.h:464 [inline] sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 unix_read_skb net/unix/af_unix.c:2546 [inline] unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x140/0x180 net/socket.c:745 ____sys_sendmsg+0x312/0x410 net/socket.c:2584 ___sys_sendmsg net/socket.c:2638 [inline] __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 do_syscall_64+0xd3/0x1d0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer similarly due to no protection of saved_data_ready. Here is another different caller causing the same issue because of the same reason. So we should protect it with sk_callback_lock read lock because the writer side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". To avoid errors that could happen in future, I move those two pairs of lock into the sk_psock_data_ready(), which is suggested by John Fastabend. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com Signed-off-by: Jason Xing <kernelxing@tencent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d Link: https://lore.kernel.org/all/20240329134037.92124-1-kerneljasonxing@gmail.com Link: https://lore.kernel.org/bpf/20240404021001.94815-1-kerneljasonxing@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17regulator: change devm_regulator_get_enable_optional() stub to return OkMatti Vaittinen1-1/+1
[ Upstream commit ff33132605c1a0acea59e4c523cb7c6fabe856b2 ] The devm_regulator_get_enable_optional() should be a 'call and forget' API, meaning, when it is used to enable the regulators, the API does not provide a handle to do any further control of the regulators. It gives no real benefit to return an error from the stub if CONFIG_REGULATOR is not set. On the contrary, returning an error is causing problems to drivers when hardware is such it works out just fine with no regulator control. Returning an error forces drivers to specifically handle the case where CONFIG_REGULATOR is not set, making the mere existence of the stub questionalble. Change the stub implementation for the devm_regulator_get_enable_optional() to return Ok so drivers do not separately handle the case where the CONFIG_REGULATOR is not set. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Fixes: da279e6965b3 ("regulator: Add devm helpers for get and enable") Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/ZiedtOE00Zozd3XO@fedora Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17regulator: change stubbed devm_regulator_get_enable to return OkMatti Vaittinen1-1/+1
[ Upstream commit 96e20adc43c4f81e9163a5188cee75a6dd393e09 ] The devm_regulator_get_enable() should be a 'call and forget' API, meaning, when it is used to enable the regulators, the API does not provide a handle to do any further control of the regulators. It gives no real benefit to return an error from the stub if CONFIG_REGULATOR is not set. On the contrary, returning and error is causing problems to drivers when hardware is such it works out just fine with no regulator control. Returning an error forces drivers to specifically handle the case where CONFIG_REGULATOR is not set, making the mere existence of the stub questionalble. Furthermore, the stub of the regulator_enable() seems to be returning Ok. Change the stub implementation for the devm_regulator_get_enable() to return Ok so drivers do not separately handle the case where the CONFIG_REGULATOR is not set. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Reported-by: Aleksander Mazur <deweloper@wp.pl> Suggested-by: Guenter Roeck <linux@roeck-us.net> Fixes: da279e6965b3 ("regulator: Add devm helpers for get and enable") Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/ZiYF6d1V1vSPcsJS@drtxq0yyyyyyyyyyyyyby-3.rev.dnainternet.fi Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17sunrpc: add a struct rpc_stats arg to rpc_create_argsJosef Bacik1-0/+1
[ Upstream commit 2057a48d0dd00c6a2a94ded7df2bf1d3f2a4a0da ] We want to be able to have our rpc stats handled in a per network namespace manner, so add an option to rpc_create_args to specify a different rpc_stats struct instead of using the one on the rpc_program. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Stable-dep-of: 24457f1be29f ("nfs: Handle error of rpc_proc_register() in nfs_net_init().") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mm: turn folio_test_hugetlb into a PageTypeMatthew Wilcox (Oracle)2-37/+34
commit d99e3140a4d33e26066183ff727d8f02f56bec64 upstream. The current folio_test_hugetlb() can be fooled by a concurrent folio split into returning true for a folio which has never belonged to hugetlbfs. This can't happen if the caller holds a refcount on it, but we have a few places (memory-failure, compaction, procfs) which do not and should not take a speculative reference. Since hugetlb pages do not use individual page mapcounts (they are always fully mapped and use the entire_mapcount field to record the number of mappings), the PageType field is available now that page_mapcount() ignores the value in this field. In compaction and with CONFIG_DEBUG_VM enabled, the current implementation can result in an oops, as reported by Luis. This happens since 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") effectively added some VM_BUG_ON() checks in the PageHuge() testing path. [willy@infradead.org: update vmcoreinfo] Link: https://lkml.kernel.org/r/ZgGZUvsdhaT1Va-T@casper.infradead.org Link: https://lkml.kernel.org/r/20240321142448.1645400-6-willy@infradead.org Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Luis Chamberlain <mcgrof@kernel.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218227 Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02firmware: qcom: uefisecapp: Fix memory related IO errors and crashesMaximilian Luz2-9/+56
commit ed09f81eeaa8f9265e1787282cb283f10285c259 upstream. It turns out that while the QSEECOM APP_SEND command has specific fields for request and response buffers, uefisecapp expects them both to be in a single memory region. Failure to adhere to this has (so far) resulted in either no response being written to the response buffer (causing an EIO to be emitted down the line), the SCM call to fail with EINVAL (i.e., directly from TZ/firmware), or the device to be hard-reset. While this issue can be triggered deterministically, in the current form it seems to happen rather sporadically (which is why it has gone unnoticed during earlier testing). This is likely due to the two kzalloc() calls (for request and response) being directly after each other. Which means that those likely return consecutive regions most of the time, especially when not much else is going on in the system. Fix this by allocating a single memory region for both request and response buffers, properly aligning both structs inside it. This unfortunately also means that the qcom_scm_qseecom_app_send() interface needs to be restructured, as it should no longer map the DMA regions separately. Therefore, move the responsibility of DMA allocation (or mapping) to the caller. Fixes: 759e7a2b62eb ("firmware: Add support for Qualcomm UEFI Secure Application") Cc: stable@vger.kernel.org # 6.7 Tested-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Konrad Dybcio <konrad.dybcio@linaro.org> # X13s Link: https://lore.kernel.org/r/20240406130125.1047436-1-luzmaximilian@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02macsec: Enable devices to advertise whether they update sk_buff md_dst ↵Rahul Rameshbabu1-0/+2
during offloads commit 475747a19316b08e856c666a20503e73d7ed67ed upstream. Cannot know whether a Rx skb missing md_dst is intended for MACsec or not without knowing whether the device is able to update this field during an offload. Assume that an offload to a MACsec device cannot support updating md_dst by default. Capable devices can advertise that they do indicate that an skb is related to a MACsec offloaded packet using the md_dst. Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: stable@vger.kernel.org Fixes: 860ead89b851 ("net/macsec: Add MACsec skb_metadata_dst Rx Data path support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20240423181319.115860-2-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02ethernet: Add helper for assigning packet type when dest address does not ↵Rahul Rameshbabu1-0/+25
match device address commit 6e159fd653d7ebf6290358e0330a0cb8a75cf73b upstream. Enable reuse of logic in eth_type_trans for determining packet type. Suggested-by: Sabrina Dubroca <sd@queasysnail.net> Cc: stable@vger.kernel.org Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20240423181319.115860-3-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02mm: support page_mapcount() on page_has_type() pagesMatthew Wilcox (Oracle)2-5/+7
commit fd1a745ce03e37945674c14833870a9af0882e2d upstream. Return 0 for pages which can't be mapped. This matches how page_mapped() works. It is more convenient for users to not have to filter out these pages. Link: https://lkml.kernel.org/r/20240321142448.1645400-5-willy@infradead.org Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02mm: create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macrosMatthew Wilcox (Oracle)1-23/+47
commit 12bbaae7635a56049779db3bef6e7140d9aa5f67 upstream. Following the separation of FOLIO_FLAGS from PAGEFLAGS, separate FOLIO_FLAG_FALSE from PAGEFLAG_FALSE and FOLIO_TYPE_OPS from PAGE_TYPE_OPS. Link: https://lkml.kernel.org/r/20240321142448.1645400-3-willy@infradead.org Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02drm: add drm_gem_object_is_shared_for_memory_stats() helperAlex Deucher1-0/+13
[ Upstream commit b31f5eba32ae8cc28e7cfa5a55ec8670d8c718e2 ] Add a helper so that drm drivers can consistently report shared status via the fdinfo shared memory stats interface. In addition to handle count, show buffers as shared if they are shared via dma-buf as well (e.g., shared with v4l or some other subsystem). v2: switch to inline function Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.keonig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Stable-dep-of: a6ff969fe9cb ("drm/amdgpu: fix visible VRAM handling during faults") Signed-off-by: Sasha Levin <sashal@kernel.org>