summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-06-05octeontx2-af: Always allocate PF entries from low prioriy zoneSubbaraya Sundeep1-11/+22
PF mcam entries has to be at low priority always so that VF can install longest prefix match rules at higher priority. This was taken care currently but when priority allocation wrt reference entry is requested then entries are allocated from mid-zone instead of low priority zone. Fix this and always allocate entries from low priority zone for PFs. Fixes: 7df5b4b260dd ("octeontx2-af: Allocate low priority entries for PF") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-04net: tls: fix marking packets as decryptedJakub Kicinski1-0/+3
For TLS offload we mark packets with skb->decrypted to make sure they don't escape the host without getting encrypted first. The crypto state lives in the socket, so it may get detached by a call to skb_orphan(). As a safety check - the egress path drops all packets with skb->decrypted and no "crypto-safe" socket. The skb marking was added to sendpage only (and not sendmsg), because tls_device injected data into the TCP stack using sendpage. This special case was missed when sendpage got folded into sendmsg. Fixes: c5c37af6ecad ("tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240530232607.82686-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-04Merge tag 'wireless-2024-06-03' of ↵Jakub Kicinski41-166/+322
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.10-rc3 The first fixes for v6.10. And we have a big one, I suspect the biggest wireless pull request we ever had. There are fixes all over, both in stack and drivers. Likely the most important here are mt76 not working on mt7615 devices, ath11k not being able to connect to 6 GHz networks and rtlwifi suffering from packet loss. But of course there's much more. * tag 'wireless-2024-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (37 commits) wifi: rtlwifi: Ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS wifi: mt76: mt7615: add missing chanctx ops wifi: wilc1000: document SRCU usage instead of SRCU Revert "wifi: wilc1000: set atomic flag on kmemdup in srcu critical section" Revert "wifi: wilc1000: convert list management to RCU" wifi: mac80211: fix UBSAN noise in ieee80211_prep_hw_scan() wifi: mac80211: correctly parse Spatial Reuse Parameter Set element wifi: mac80211: fix Spatial Reuse element size check wifi: iwlwifi: mvm: don't read past the mfuart notifcation wifi: iwlwifi: mvm: Fix scan abort handling with HW rfkill wifi: iwlwifi: mvm: check n_ssids before accessing the ssids wifi: iwlwifi: mvm: properly set 6 GHz channel direct probe option wifi: iwlwifi: mvm: handle BA session teardown in RF-kill wifi: iwlwifi: mvm: Handle BIGTK cipher in kek_kck cmd wifi: iwlwifi: mvm: remove stale STA link data during restart wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef wifi: iwlwifi: mvm: set properly mac header wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64 wifi: iwlwifi: mvm: d3: fix WoWLAN command version lookup wifi: iwlwifi: mvm: fix a crash on 7265 ... ==================== Link: https://lore.kernel.org/r/20240603115129.9494CC2BD10@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04lib/test_rhashtable: add missing MODULE_DESCRIPTION() macroJeff Johnson1-0/+1
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_rhashtable.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240531-md-lib-test_rhashtable-v1-1-cd6d4138f1b6@quicinc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04Merge branch 'dst_cache-fix-possible-races'Jakub Kicinski5-21/+24
Eric Dumazet says: ==================== dst_cache: fix possible races This series is inspired by various undisclosed syzbot reports hinting at corruptions in dst_cache structures. It seems at least four users of dst_cache are racy against BH reentrancy. Last patch is adding a DEBUG_NET check to catch future misuses. ==================== Link: https://lore.kernel.org/r/20240531132636.2637995-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04net: dst_cache: add two DEBUG_NET warningsEric Dumazet1-0/+2
After fixing four different bugs involving dst_cache users, it might be worth adding a check about BH being blocked by dst_cache callers. DEBUG_NET_WARN_ON_ONCE(!in_softirq()); It is not fatal, if we missed valid case where no BH deadlock is to be feared, we might change this. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04ila: block BH in ila_output()Eric Dumazet1-1/+6
As explained in commit 1378817486d6 ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. ila_output() is called from lwtunnel_output() possibly from process context, and under rcu_read_lock(). We might be interrupted by a softirq, re-enter ila_output() and corrupt dst_cache data structures. Fix the race by using local_bh_disable(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04ipv6: sr: block BH in seg6_output_core() and seg6_input_core()Eric Dumazet1-8/+6
As explained in commit 1378817486d6 ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. Disabling preemption in seg6_output_core() is not good enough, because seg6_output_core() is called from process context, lwtunnel_output() only uses rcu_read_lock(). We might be interrupted by a softirq, re-enter seg6_output_core() and corrupt dst_cache data structures. Fix the race by using local_bh_disable() instead of preempt_disable(). Apply a similar change in seg6_input_core(). Fixes: fa79581ea66c ("ipv6: sr: fix several BUGs when preemption is enabled") Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Lebrun <dlebrun@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04net: ipv6: rpl_iptunnel: block BH in rpl_output() and rpl_input()Eric Dumazet1-8/+6
As explained in commit 1378817486d6 ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. Disabling preemption in rpl_output() is not good enough, because rpl_output() is called from process context, lwtunnel_output() only uses rcu_read_lock(). We might be interrupted by a softirq, re-enter rpl_output() and corrupt dst_cache data structures. Fix the race by using local_bh_disable() instead of preempt_disable(). Apply a similar change in rpl_input(). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexander Aring <aahringo@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04ipv6: ioam: block BH from ioam6_output()Eric Dumazet1-4/+4
As explained in commit 1378817486d6 ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. Disabling preemption in ioam6_output() is not good enough, because ioam6_output() is called from process context, lwtunnel_output() only uses rcu_read_lock(). We might be interrupted by a softirq, re-enter ioam6_output() and corrupt dst_cache data structures. Fix the race by using local_bh_disable() instead of preempt_disable(). Fixes: 8cb3bf8bff3c ("ipv6: ioam: Add support for the ip6ip6 encapsulation") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Justin Iurman <justin.iurman@uliege.be> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-04vmxnet3: disable rx data ring on dma allocation failureMatthias Stocker1-1/+1
When vmxnet3_rq_create() fails to allocate memory for rq->data_ring.base, the subsequent call to vmxnet3_rq_destroy_all_rxdataring does not reset rq->data_ring.desc_size for the data ring that failed, which presumably causes the hypervisor to reference it on packet reception. To fix this bug, rq->data_ring.desc_size needs to be set to 0 to tell the hypervisor to disable this feature. [ 95.436876] kernel BUG at net/core/skbuff.c:207! [ 95.439074] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 95.440411] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 6.9.3-dirty #1 [ 95.441558] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018 [ 95.443481] RIP: 0010:skb_panic+0x4d/0x4f [ 95.444404] Code: 4f 70 50 8b 87 c0 00 00 00 50 8b 87 bc 00 00 00 50 ff b7 d0 00 00 00 4c 8b 8f c8 00 00 00 48 c7 c7 68 e8 be 9f e8 63 58 f9 ff <0f> 0b 48 8b 14 24 48 c7 c1 d0 73 65 9f e8 a1 ff ff ff 48 8b 14 24 [ 95.447684] RSP: 0018:ffffa13340274dd0 EFLAGS: 00010246 [ 95.448762] RAX: 0000000000000089 RBX: ffff8fbbc72b02d0 RCX: 000000000000083f [ 95.450148] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f [ 95.451520] RBP: 000000000000002d R08: 0000000000000000 R09: ffffa13340274c60 [ 95.452886] R10: ffffffffa04ed468 R11: 0000000000000002 R12: 0000000000000000 [ 95.454293] R13: ffff8fbbdab3c2d0 R14: ffff8fbbdbd829e0 R15: ffff8fbbdbd809e0 [ 95.455682] FS: 0000000000000000(0000) GS:ffff8fbeefd80000(0000) knlGS:0000000000000000 [ 95.457178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 95.458340] CR2: 00007fd0d1f650c8 CR3: 0000000115f28000 CR4: 00000000000406f0 [ 95.459791] Call Trace: [ 95.460515] <IRQ> [ 95.461180] ? __die_body.cold+0x19/0x27 [ 95.462150] ? die+0x2e/0x50 [ 95.462976] ? do_trap+0xca/0x110 [ 95.463973] ? do_error_trap+0x6a/0x90 [ 95.464966] ? skb_panic+0x4d/0x4f [ 95.465901] ? exc_invalid_op+0x50/0x70 [ 95.466849] ? skb_panic+0x4d/0x4f [ 95.467718] ? asm_exc_invalid_op+0x1a/0x20 [ 95.468758] ? skb_panic+0x4d/0x4f [ 95.469655] skb_put.cold+0x10/0x10 [ 95.470573] vmxnet3_rq_rx_complete+0x862/0x11e0 [vmxnet3] [ 95.471853] vmxnet3_poll_rx_only+0x36/0xb0 [vmxnet3] [ 95.473185] __napi_poll+0x2b/0x160 [ 95.474145] net_rx_action+0x2c6/0x3b0 [ 95.475115] handle_softirqs+0xe7/0x2a0 [ 95.476122] __irq_exit_rcu+0x97/0xb0 [ 95.477109] common_interrupt+0x85/0xa0 [ 95.478102] </IRQ> [ 95.478846] <TASK> [ 95.479603] asm_common_interrupt+0x26/0x40 [ 95.480657] RIP: 0010:pv_native_safe_halt+0xf/0x20 [ 95.481801] Code: 22 d7 e9 54 87 01 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 93 ba 3b 00 fb f4 <e9> 2c 87 01 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 [ 95.485563] RSP: 0018:ffffa133400ffe58 EFLAGS: 00000246 [ 95.486882] RAX: 0000000000004000 RBX: ffff8fbbc1d14064 RCX: 0000000000000000 [ 95.488477] RDX: ffff8fbeefd80000 RSI: ffff8fbbc1d14000 RDI: 0000000000000001 [ 95.490067] RBP: ffff8fbbc1d14064 R08: ffffffffa0652260 R09: 00000000000010d3 [ 95.491683] R10: 0000000000000018 R11: ffff8fbeefdb4764 R12: ffffffffa0652260 [ 95.493389] R13: ffffffffa06522e0 R14: 0000000000000001 R15: 0000000000000000 [ 95.495035] acpi_safe_halt+0x14/0x20 [ 95.496127] acpi_idle_do_entry+0x2f/0x50 [ 95.497221] acpi_idle_enter+0x7f/0xd0 [ 95.498272] cpuidle_enter_state+0x81/0x420 [ 95.499375] cpuidle_enter+0x2d/0x40 [ 95.500400] do_idle+0x1e5/0x240 [ 95.501385] cpu_startup_entry+0x29/0x30 [ 95.502422] start_secondary+0x11c/0x140 [ 95.503454] common_startup_64+0x13e/0x141 [ 95.504466] </TASK> [ 95.505197] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables vsock_loopback vmw_vsock_virtio_transport_common qrtr vmw_vsock_vmci_transport vsock sunrpc binfmt_misc pktcdvd vmw_balloon pcspkr vmw_vmci i2c_piix4 joydev loop dm_multipath nfnetlink zram crct10dif_pclmul crc32_pclmul vmwgfx crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 vmxnet3 sha1_ssse3 drm_ttm_helper vmw_pvscsi ttm ata_generic pata_acpi serio_raw scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables fuse [ 95.516536] ---[ end trace 0000000000000000 ]--- Fixes: 6f4833383e85 ("net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete()") Signed-off-by: Matthias Stocker <mstocker@barracuda.com> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Ronak Doshi <ronak.doshi@broadcom.com> Link: https://lore.kernel.org/r/20240531103711.101961-1-mstocker@barracuda.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-03net: phy: micrel: fix KSZ9477 PHY issues after suspend/resumeTristram Ha1-6/+56
When the PHY is powered up after powered down most of the registers are reset, so the PHY setup code needs to be done again. In addition the interrupt register will need to be setup again so that link status indication works again. Fixes: 26dd2974c5b5 ("net: phy: micrel: Move KSZ9477 errata fixes to PHY driver") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-02net/tcp: Don't consider TCP_CLOSE in TCP_AO_ESTABLISHEDDmitry Safonov2-7/+13
TCP_CLOSE may or may not have current/rnext keys and should not be considered "established". The fast-path for TCP_CLOSE is SKB_DROP_REASON_TCP_CLOSE. This is what tcp_rcv_state_process() does anyways. Add an early drop path to not spend any time verifying segment signatures for sockets in TCP_CLOSE state. Cc: stable@vger.kernel.org # v6.7 Fixes: 0a3a809089eb ("net/tcp: Verify inbound TCP-AO signed segments") Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://lore.kernel.org/r/20240529-tcp_ao-sk_state-v1-1-d69b5d323c52@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02net/ncsi: Fix the multi thread manner of NCSI driverDelphineCCChiu3-38/+41
Currently NCSI driver will send several NCSI commands back to back without waiting the response of previous NCSI command or timeout in some state when NIC have multi channel. This operation against the single thread manner defined by NCSI SPEC(section 6.3.2.3 in DSP0222_1.1.1) According to NCSI SPEC(section 6.2.13.1 in DSP0222_1.1.1), we should probe one channel at a time by sending NCSI commands (Clear initial state, Get version ID, Get capabilities...), than repeat this steps until the max number of channels which we got from NCSI command (Get capabilities) has been probed. Fixes: e6f44ed6d04d ("net/ncsi: Package and channel management") Signed-off-by: DelphineCCChiu <delphine_cc_chiu@wiwynn.com> Link: https://lore.kernel.org/r/20240529065856.825241-1-delphine_cc_chiu@wiwynn.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02net: rps: fix error when CONFIG_RFS_ACCEL is offJason Xing1-1/+2
John Sperbeck reported that if we turn off CONFIG_RFS_ACCEL, the 'head' is not defined, which will trigger compile error. So I move the 'head' out of the CONFIG_RFS_ACCEL scope. Fixes: 84b6823cd96b ("net: rps: protect last_qtail with rps_input_queue_tail_save() helper") Reported-by: John Sperbeck <jsperbeck@google.com> Closes: https://lore.kernel.org/all/20240529203421.2432481-1-jsperbeck@google.com/ Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240530032717.57787-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put()Duoming Zhou1-1/+1
The object "ax25_dev" is managed by reference counting. Thus it should not be directly released by kfree(), replace with ax25_dev_put(). Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20240530051733.11416-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02ax25: Fix refcount imbalance on inbound connectionsLars Kellogg-Stedman1-0/+6
When releasing a socket in ax25_release(), we call netdev_put() to decrease the refcount on the associated ax.25 device. However, the execution path for accepting an incoming connection never calls netdev_hold(). This imbalance leads to refcount errors, and ultimately to kernel crashes. A typical call trace for the above situation will start with one of the following errors: refcount_t: decrement hit 0; leaking memory. refcount_t: underflow; use-after-free. And will then have a trace like: Call Trace: <TASK> ? show_regs+0x64/0x70 ? __warn+0x83/0x120 ? refcount_warn_saturate+0xb2/0x100 ? report_bug+0x158/0x190 ? prb_read_valid+0x20/0x30 ? handle_bug+0x3e/0x70 ? exc_invalid_op+0x1c/0x70 ? asm_exc_invalid_op+0x1f/0x30 ? refcount_warn_saturate+0xb2/0x100 ? refcount_warn_saturate+0xb2/0x100 ax25_release+0x2ad/0x360 __sock_release+0x35/0xa0 sock_close+0x19/0x20 [...] On reboot (or any attempt to remove the interface), the kernel gets stuck in an infinite loop: unregister_netdevice: waiting for ax0 to become free. Usage count = 0 This patch corrects these issues by ensuring that we call netdev_hold() and ax25_dev_hold() for new connections in ax25_accept(). This makes the logic leading to ax25_accept() match the logic for ax25_bind(): in both cases we increment the refcount, which is ultimately decremented in ax25_release(). Fixes: 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()") Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com> Tested-by: Duoming Zhou <duoming@zju.edu.cn> Tested-by: Dan Cross <crossd@gmail.com> Tested-by: Chris Maness <christopher.maness@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20240529210242.3346844-2-lars@oddbit.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02Merge branch 'virtio_net-fix-lock-warning-and-unrecoverable-state'Jakub Kicinski1-21/+17
Heng Qi says: ==================== virtio_net: fix lock warning and unrecoverable state Patch 1 describes and fixes an issue where dim cannot return to normal state in certain scenarios. Patch 2 attempts to resolve lockdep's complaints that holding many nested locks. ==================== Link: https://lore.kernel.org/r/20240528134116.117426-1-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02virtio_net: fix a spurious deadlock issueHeng Qi1-20/+16
When the following snippet is run, lockdep will report a deadlock[1]. /* Acquire all queues dim_locks */ for (i = 0; i < vi->max_queue_pairs; i++) mutex_lock(&vi->rq[i].dim_lock); There's no deadlock here because the vq locks are always taken in the same order, but lockdep can not figure it out. So refactoring the code to alleviate the problem. [1] ======================================================== WARNING: possible recursive locking detected 6.9.0-rc7+ #319 Not tainted -------------------------------------------- ethtool/962 is trying to acquire lock: but task is already holding lock: other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&vi->rq[i].dim_lock); lock(&vi->rq[i].dim_lock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by ethtool/962: #0: ffffffff82dbaab0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 #1: ffffffff82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at: ethnl_default_set_doit+0xbe/0x1e0 stack backtrace: CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x79/0xb0 check_deadlock+0x130/0x220 __lock_acquire+0x861/0x990 lock_acquire.part.0+0x72/0x1d0 ? lock_acquire+0xf8/0x130 __mutex_lock+0x71/0xd50 virtnet_set_coalesce+0x151/0x190 __ethnl_set_coalesce.isra.0+0x3f8/0x4d0 ethnl_set_coalesce+0x34/0x90 ethnl_default_set_doit+0xdd/0x1e0 genl_family_rcv_msg_doit+0xdc/0x130 genl_family_rcv_msg+0x154/0x230 ? __pfx_ethnl_default_set_doit+0x10/0x10 genl_rcv_msg+0x4b/0xa0 ? __pfx_genl_rcv_msg+0x10/0x10 netlink_rcv_skb+0x5a/0x110 genl_rcv+0x28/0x40 netlink_unicast+0x1af/0x280 netlink_sendmsg+0x20e/0x460 __sys_sendto+0x1fe/0x210 ? find_held_lock+0x2b/0x80 ? do_user_addr_fault+0x3a2/0x8a0 ? __lock_release+0x5e/0x160 ? do_user_addr_fault+0x3a2/0x8a0 ? lock_release+0x72/0x140 ? do_user_addr_fault+0x3a7/0x8a0 __x64_sys_sendto+0x29/0x30 do_syscall_64+0x78/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240528134116.117426-3-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02virtio_net: fix possible dim status unrecoverableHeng Qi1-1/+1
When the dim worker is scheduled, if it no longer needs to issue commands, dim may not be able to return to the working state later. For example, the following single queue scenario: 1. The dim worker of rxq0 is scheduled, and the dim status is changed to DIM_APPLY_NEW_PROFILE; 2. dim is disabled or parameters have not been modified; 3. virtnet_rx_dim_work exits directly; Then, even if net_dim is invoked again, it cannot work because the state is not restored to DIM_START_MEASURE. Fixes: 6208799553a8 ("virtio-net: support rx netdim") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240528134116.117426-2-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02ethtool: init tsinfo stats if requestedVadim Fedorenko1-3/+3
Statistic values should be set to ETHTOOL_STAT_NOT_SET even if the device doesn't support statistics. Otherwise zeros will be returned as if they are proper values: host# ethtool -I -T lo Time stamping parameters for lo: Capabilities: software-transmit software-receive software-system-clock PTP Hardware Clock: none Hardware Transmit Timestamp Modes: none Hardware Receive Filter Modes: none Statistics: tx_pkts: 0 tx_lost: 0 tx_err: 0 Fixes: 0e9c127729be ("ethtool: add interface to read Tx hardware timestamping statistics") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Link: https://lore.kernel.org/r/20240530040814.1014446-1-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02MAINTAINERS: remove Peter GeisPeter Geis1-1/+0
The Motorcomm PHY driver is now maintained by the OEM. The driver has expanded far beyond my original purpose, and I do not have the hardware to test against the new portions of it. Therefore I am removing myself as a maintainer of the driver. Signed-off-by: Peter Geis <pgwipeout@gmail.com> Link: https://lore.kernel.org/r/20240529185635.538072-1-pgwipeout@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-02virtio_net: fix missing lock protection on control_buf accessHeng Qi1-1/+3
Refactored the handling of control_buf to be within the cvq_lock critical section, mitigating race conditions between reading device responses and new command submissions. Fixes: 6f45ab3e0409 ("virtio_net: Add a lock for the command VQ.") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240530034143.19579-1-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-01wifi: rtlwifi: Ignore IEEE80211_CONF_CHANGE_RETRY_LIMITSBitterblue Smith1-15/+0
Since commit 0a44dfc07074 ("wifi: mac80211: simplify non-chanctx drivers") ieee80211_hw_config() is no longer called with changed = ~0. rtlwifi relied on ~0 in order to ignore the default retry limits of 4/7, preferring 48/48 in station mode and 7/7 in AP/IBSS. RTL8192DU has a lot of packet loss with the default limits from mac80211. Fix it by ignoring IEEE80211_CONF_CHANGE_RETRY_LIMITS completely, because it's the simplest solution. Link: https://lore.kernel.org/linux-wireless/cedd13d7691f4692b2a2fa5a24d44a22@realtek.com/ Cc: stable@vger.kernel.org # 6.9.x Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/1fabb8e4-adf3-47ae-8462-8aea963bc2a5@gmail.com
2024-06-01wifi: mt76: mt7615: add missing chanctx opsJohannes Berg1-0/+4
Here's another one I missed during the initial conversion, fix that. Cc: stable@vger.kernel.org Reported-by: Rene Petersen <renepetersen@posteo.de> Fixes: 0a44dfc07074 ("wifi: mac80211: simplify non-chanctx drivers") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218895 Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528142308.3f7db1821e68.I531135d7ad76331a50244d6d5288e14aa9668390@changeid
2024-06-01wifi: wilc1000: document SRCU usage instead of SRCUAlexis Lothoré1-0/+7
Commit f236464f1db7 ("wifi: wilc1000: convert list management to RCU") attempted to convert SRCU to RCU usage, assuming it was not really needed. The runtime issues that arose after merging it showed that there are code paths involving sleeping functions, and removing those would need some heavier driver rework. Add some documentation about SRCU need to make sure that any future developer do not miss some use cases if tempted to convert back again to RCU. Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528-wilc_revert_srcu_to_rcu-v1-3-bce096e0798c@bootlin.com
2024-06-01Revert "wifi: wilc1000: set atomic flag on kmemdup in srcu critical section"Alexis Lothoré1-1/+1
This reverts commit 35aee01ff43d7eb6c2caa0b94e7cc6c45baeeab7 Commit 35aee01ff43d ("wifi: wilc1000: set atomic flag on kmemdup in srcu critical section") was preparatory to the SRCU to RCU conversion done by commit f236464f1db7 ("wifi: wilc1000: convert list management to RCU"). This conversion brought issues and so has been reverted, so the atomic flag is not needed anymore. Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528-wilc_revert_srcu_to_rcu-v1-2-bce096e0798c@bootlin.com
2024-06-01Revert "wifi: wilc1000: convert list management to RCU"Alexis Lothoré5-45/+64
This reverts commit f236464f1db7bea80075e6e31ac70dc6eb80547f Commit f236464f1db7 ("wifi: wilc1000: convert list management to RCU") replaced SRCU with RCU, aiming to simplify RCU usage in the driver. No documentation or commit history hinted about why SRCU has been preferred in original design, so it has been assumed to be safe to do this conversion. Unfortunately, some static analyzers raised warnings, confirmed by runtime checker, not long after the merge. At least three different issues arose when switching to RCU: - wilc_wlan_txq_filter_dup_tcp_ack is executed in a RCU read critical section yet calls wait_for_completion_timeout - wilc_wfi_init_mon_interface calls kmalloc and register_netdevice while manipulating a vif retrieved from vif list - set_channel sends command to chip (and so, also waits for a completion) while holding a vif retrieved from vif list (so, in RCU read critical section) Some of those issues are not trivial to fix and would need bigger driver rework. Fix those issues by reverting the SRCU to RCU conversion commit Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-wireless/3b46ec7c-baee-49fd-b760-3bc12fb12eaf@moroto.mountain/ Fixes: f236464f1db7 ("wifi: wilc1000: convert list management to RCU") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528-wilc_revert_srcu_to_rcu-v1-1-bce096e0798c@bootlin.com
2024-06-01Merge tag 'ath-current-20240531' of ↵Kalle Valo4-22/+44
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath Merge ath-current from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath.git fixes for 6.10. Two fixes for user reported regressions in ath11k. One dependency fix and one error path fix.
2024-05-30Merge tag 'net-6.10-rc2' of ↵Linus Torvalds78-381/+1179
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - gro: initialize network_offset in network layer - tcp: reduce accepted window in NEW_SYN_RECV state Current release - new code bugs: - eth: mlx5e: do not use ptp structure for tx ts stats when not initialized - eth: ice: check for unregistering correct number of devlink params Previous releases - regressions: - bpf: Allow delete from sockmap/sockhash only if update is allowed - sched: taprio: extend minimum interval restriction to entire cycle too - netfilter: ipset: add list flush to cancel_gc - ipv4: fix address dump when IPv4 is disabled on an interface - sock_map: avoid race between sock_map_close and sk_psock_put - eth: mlx5: use mlx5_ipsec_rx_status_destroy to correctly delete status rules Previous releases - always broken: - core: fix __dst_negative_advice() race - bpf: - fix multi-uprobe PID filtering logic - fix pkt_type override upon netkit pass verdict - netfilter: tproxy: bail out if IP has been disabled on the device - af_unix: annotate data-race around unix_sk(sk)->addr - eth: mlx5e: fix UDP GSO for encapsulated packets - eth: idpf: don't enable NAPI and interrupts prior to allocating Rx buffers - eth: i40e: fully suspend and resume IO operations in EEH case - eth: octeontx2-pf: free send queue buffers incase of leaf to inner - eth: ipvlan: dont Use skb->sk in ipvlan_process_v{4,6}_outbound" * tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) netdev: add qstat for csum complete ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound net: ena: Fix redundant device NUMA node override ice: check for unregistering correct number of devlink params ice: fix 200G PHY types to link speed mapping i40e: Fully suspend and resume IO operations in EEH case i40e: factoring out i40e_suspend/i40e_resume e1000e: move force SMBUS near the end of enable_ulp function net: dsa: microchip: fix RGMII error in KSZ DSA driver ipv4: correctly iterate over the target netns in inet_dump_ifaddr() net: fix __dst_negative_advice() race nfc/nci: Add the inconsistency check between the input data length and count MAINTAINERS: dwmac: starfive: update Maintainer net/sched: taprio: extend minimum interval restriction to entire cycle too net/sched: taprio: make q->picos_per_byte available to fill_sched_entry() netfilter: nft_fib: allow from forward/input without iif selector netfilter: tproxy: bail out if IP has been disabled on the device netfilter: nft_payload: skbuff vlan metadata mangle support net: ti: icssg-prueth: Fix start counter for ft1 filter sock_map: avoid race between sock_map_close and sk_psock_put ...
2024-05-30netdev: add qstat for csum completeJakub Kicinski3-0/+6
Recent commit 0cfe71f45f42 ("netdev: add queue stats") added a lot of useful stats, but only those immediately needed by virtio. Presumably virtio does not support CHECKSUM_COMPLETE, so statistic for that form of checksumming wasn't included. Other drivers will definitely need it, in fact we expect it to be needed in net-next soon (mlx5). So let's add the definition of the counter for CHECKSUM_COMPLETE to uAPI in net already, so that the counters are in a more natural order (all subsequent counters have not been present in any released kernel, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Fixes: 0cfe71f45f42 ("netdev: add queue stats") Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-30ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outboundYue Haibing1-2/+2
Raw packet from PF_PACKET socket ontop of an IPv6-backed ipvlan device will hit WARN_ON_ONCE() in sk_mc_loop() through sch_direct_xmit() path. WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 sk_mc_loop+0x2d/0x70 Modules linked in: sch_netem ipvlan rfkill cirrus drm_shmem_helper sg drm_kms_helper CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ #279 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:sk_mc_loop+0x2d/0x70 Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212 RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001 RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000 RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00 R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000 R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000 FS: 0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? __warn (kernel/panic.c:693) ? sk_mc_loop (net/core/sock.c:760) ? report_bug (lib/bug.c:201 lib/bug.c:219) ? handle_bug (arch/x86/kernel/traps.c:239) ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) ? sk_mc_loop (net/core/sock.c:760) ip6_finish_output2 (net/ipv6/ip6_output.c:83 (discriminator 1)) ? nf_hook_slow (net/netfilter/core.c:626) ip6_finish_output (net/ipv6/ip6_output.c:222) ? __pfx_ip6_finish_output (net/ipv6/ip6_output.c:215) ipvlan_xmit_mode_l3 (drivers/net/ipvlan/ipvlan_core.c:602) ipvlan ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:226) ipvlan dev_hard_start_xmit (net/core/dev.c:3594) sch_direct_xmit (net/sched/sch_generic.c:343) __qdisc_run (net/sched/sch_generic.c:416) net_tx_action (net/core/dev.c:5286) handle_softirqs (kernel/softirq.c:555) __irq_exit_rcu (kernel/softirq.c:589) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043) The warning triggers as this: packet_sendmsg packet_snd //skb->sk is packet sk __dev_queue_xmit __dev_xmit_skb //q->enqueue is not NULL __qdisc_run sch_direct_xmit dev_hard_start_xmit ipvlan_start_xmit ipvlan_xmit_mode_l3 //l3 mode ipvlan_process_outbound //vepa flag ipvlan_process_v6_outbound ip6_local_out __ip6_finish_output ip6_finish_output2 //multicast packet sk_mc_loop //sk->sk_family is AF_PACKET Call ip{6}_local_out() with NULL sk in ipvlan as other tunnels to fix this. Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240529095633.613103-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-30Merge tag 'nf-24-05-29' of ↵Paolo Abeni5-28/+82
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: Patch #1 syzbot reports that nf_reinject() could be called without rcu_read_lock() when flushing pending packets at nfnetlink queue removal, from Eric Dumazet. Patch #2 flushes ipset list:set when canceling garbage collection to reference to other lists to fix a race, from Jozsef Kadlecsik. Patch #3 restores q-in-q matching with nft_payload by reverting f6ae9f120dad ("netfilter: nft_payload: add C-VLAN support"). Patch #4 fixes vlan mangling in skbuff when vlan offload is present in skbuff, without this patch nft_payload corrupts packets in this case. Patch #5 fixes possible nul-deref in tproxy no IP address is found in netdevice, reported by syzbot and patch from Florian Westphal. Patch #6 removes a superfluous restriction which prevents loose fib lookups from input and forward hooks, from Eric Garver. My assessment is that patches #1, #2 and #5 address possible kernel crash, anything else in this batch fixes broken features. netfilter pull request 24-05-29 * tag 'nf-24-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_fib: allow from forward/input without iif selector netfilter: tproxy: bail out if IP has been disabled on the device netfilter: nft_payload: skbuff vlan metadata mangle support netfilter: nft_payload: restore vlan q-in-q match support netfilter: ipset: Add list flush to cancel_gc netfilter: nfnetlink_queue: acquire rcu_read_lock() in instance_destroy_rcu() ==================== Link: https://lore.kernel.org/r/20240528225519.1155786-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-30net: ena: Fix redundant device NUMA node overrideShay Agroskin1-11/+0
The driver overrides the NUMA node id of the device regardless of whether it knows its correct value (often setting it to -1 even though the node id is advertised in 'struct device'). This can lead to suboptimal configurations. This patch fixes this behavior and makes the shared memory allocation functions use the NUMA node id advertised by the underlying device. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20240528170912.1204417-1-shayagr@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30Merge branch 'intel-wired-lan-driver-updates-2024-05-28-e1000e-i40e-ice'Jakub Kicinski5-144/+195
Jacob Keller says: ==================== Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) [part] This series includes a variety of fixes that have been accumulating on the Intel Wired LAN dev-queue. Hui Wang provides a fix for suspend/resume on e1000e due to failure to correctly setup the SMBUS in enable_ulp(). Thinh Tran provides a fix for EEH I/O suspend/resume on i40e to ensure that I/O operations can continue after a resume. To avoid duplicate code, the common logic is factored out of i40e_suspend and i40e_resume. Paul Greenwalt provides a fix to correctly map the 200G PHY types to link speeds in the ice driver. Dave Ertman provides a fix correcting devlink parameter unregistration in the event that the driver loads in safe mode and some of the parameters were not registered. ==================== Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-0-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30ice: check for unregistering correct number of devlink paramsDave Ertman1-9/+22
On module load, the ice driver checks for the lack of a specific PF capability to determine if it should reduce the number of devlink params to register. One situation when this test returns true is when the driver loads in safe mode. The same check is not present on the unload path when devlink params are unregistered. This results in the driver triggering a WARN_ON in the kernel devlink code. The current check and code path uses a reduction in the number of elements reported in the list of params. This is fragile and not good for future maintaining. Change the parameters to be held in two lists, one always registered and one dependent on the check. Add a symmetrical check in the unload path so that the correct parameters are unregistered as well. Fixes: 109eb2917284 ("ice: Add tx_scheduling_layers devlink param") CC: Lukasz Czapnik <lukasz.czapnik@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-8-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30ice: fix 200G PHY types to link speed mappingPaul Greenwalt1-0/+10
Commit 24407a01e57c ("ice: Add 200G speed/phy type use") added support for 200G PHY speeds, but did not include the mapping of 200G PHY types to link speed. As a result the driver is returning UNKNOWN link speed when setting 200G ethtool advertised link modes. To fix this add 200G PHY types to link speed mapping to ice_get_link_speed_based_on_phy_type(). Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-5-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30i40e: Fully suspend and resume IO operations in EEH caseThinh Tran1-3/+6
When EEH events occurs, the callback functions in the i40e, which are managed by the EEH driver, will completely suspend and resume all IO operations. - In the PCI error detected callback, replaced i40e_prep_for_reset() with i40e_io_suspend(). The change is to fully suspend all I/O operations - In the PCI error slot reset callback, replaced pci_enable_device_mem() with pci_enable_device(). This change enables both I/O and memory of the device. - In the PCI error resume callback, replaced i40e_handle_reset_warning() with i40e_io_resume(). This change allows the system to resume I/O operations Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Robert Thomas <rob.thomas@ibm.com> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-3-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30i40e: factoring out i40e_suspend/i40e_resumeThinh Tran1-114/+135
Two new functions, i40e_io_suspend() and i40e_io_resume(), have been introduced. These functions were factored out from the existing i40e_suspend() and i40e_resume() respectively. This factoring was done due to concerns about the logic of the I40E_SUSPENSED state, which caused the device to be unable to recover. The functions are now used in the EEH handling for device suspend/resume callbacks. The function i40e_enable_mc_magic_wake() has been moved ahead of i40e_io_suspend() to ensure it is declared before being used. Tested-by: Robert Thomas <rob.thomas@ibm.com> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-2-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30e1000e: move force SMBUS near the end of enable_ulp functionHui Wang2-18/+22
The commit 861e8086029e ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue") introduces a regression on PCH_MTP_I219_LM18 (PCIID: 0x8086550A). Without the referred commit, the ethernet works well after suspend and resume, but after applying the commit, the ethernet couldn't work anymore after the resume and the dmesg shows that the NIC link changes to 10Mbps (1000Mbps originally): [ 43.305084] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx Without the commit, the force SMBUS code will not be executed if "return 0" or "goto out" is executed in the enable_ulp(), and in my case, the "goto out" is executed since FWSM_FW_VALID is set. But after applying the commit, the force SMBUS code will be ran unconditionally. Here move the force SMBUS code back to enable_ulp() and put it immediately ahead of hw->phy.ops.release(hw), this could allow the longest settling time as possible for interface in this function and doesn't change the original code logic. The issue was found on a Lenovo laptop with the ethernet hw as below: 00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:550a] (rev 20). And this patch is verified (cable plug and unplug, system suspend and resume) on Lenovo laptops with ethernet hw: [8086:550a], [8086:550b], [8086:15bb], [8086:15be], [8086:1a1f], [8086:1a1c] and [8086:0dc7]. Fixes: 861e8086029e ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue") Signed-off-by: Hui Wang <hui.wang@canonical.com> Acked-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-1-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30net: dsa: microchip: fix RGMII error in KSZ DSA driverTristram Ha1-1/+1
The driver should return RMII interface when XMII is running in RMII mode. Fixes: 0ab7f6bf1675 ("net: dsa: microchip: ksz9477: use common xmii function") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Acked-by: Jerry Ray <jerry.ray@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/1716932066-3342-1-git-send-email-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30ipv4: correctly iterate over the target netns in inet_dump_ifaddr()Alexander Mikhalitsyn1-1/+1
A recent change to inet_dump_ifaddr had the function incorrectly iterate over net rather than tgt_net, resulting in the data coming for the incorrect network namespace. Fixes: cdb2f80f1c10 ("inet: use xa_array iterator to implement inet_dump_ifaddr()") Reported-by: Stéphane Graber <stgraber@stgraber.org> Closes: https://github.com/lxc/incus/issues/892 Bisected-by: Stéphane Graber <stgraber@stgraber.org> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Tested-by: Stéphane Graber <stgraber@stgraber.org> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240528203030.10839-1-aleksandr.mikhalitsyn@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30net: fix __dst_negative_advice() raceEric Dumazet5-47/+30
__dst_negative_advice() does not enforce proper RCU rules when sk->dst_cache must be cleared, leading to possible UAF. RCU rules are that we must first clear sk->sk_dst_cache, then call dst_release(old_dst). Note that sk_dst_reset(sk) is implementing this protocol correctly, while __dst_negative_advice() uses the wrong order. Given that ip6_negative_advice() has special logic against RTF_CACHE, this means each of the three ->negative_advice() existing methods must perform the sk_dst_reset() themselves. Note the check against NULL dst is centralized in __dst_negative_advice(), there is no need to duplicate it in various callbacks. Many thanks to Clement Lecigne for tracking this issue. This old bug became visible after the blamed commit, using UDP sockets. Fixes: a87cb3e48ee8 ("net: Facility to report route quality of connected sockets") Reported-by: Clement Lecigne <clecigne@google.com> Diagnosed-by: Clement Lecigne <clecigne@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29Revert "vfs: Delete the associated dentry when deleting a file"Linus Torvalds1-7/+8
This reverts commit 681ce8623567ba7e7333908e9826b77145312dda. We gave it a try, but it turns out the kernel test robot did in fact find performance regressions for it, so we'll have to look at the more involved alternative fixes for Yafang Shao's Elasticsearch load issue. There were several alternatives discussed, they just weren't as simple as this first attempt. The report is of a -7.4% regression of filebench.sum_operations/s, which appears significant enough to trigger my "this patch may get reverted if somebody finds a performance regression on some other load" rule. So it's still the case that we should end up deleting dentries more aggressively - or just be better at pruning them later - but it needs a bit more finesse than this simple thing. Link: https://lore.kernel.org/all/202405291318.4dfbb352-oliver.sang@intel.com/ Cc: Yafang Shao <laoar.shao@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-29Merge tag '9p-for-6.10-rc2' of https://github.com/martinetd/linuxLinus Torvalds2-2/+9
Pull 9p fixes from Dominique Martinet: "Two fixes headed to stable trees: - a trace event was dumping uninitialized values - a missing lock that was thought to have exclusive access, and it turned out not to" * tag '9p-for-6.10-rc2' of https://github.com/martinetd/linux: 9p: add missing locking around taking dentry fid list net/9p: fix uninit-value in p9_client_rpc()
2024-05-29Merge tag 'v6.10-p3' of ↵Linus Torvalds1-43/+4
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a new run-time warning triggered by tpm" * tag 'v6.10-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: hwrng: core - Remove add_early_randomness
2024-05-29wifi: mac80211: fix UBSAN noise in ieee80211_prep_hw_scan()Dmitry Antipov1-4/+10
When testing the previous patch with CONFIG_UBSAN_BOUNDS, I've noticed the following: UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:372:4 index 0 is out of range for type 'struct ieee80211_channel *[]' CPU: 0 PID: 1435 Comm: wpa_supplicant Not tainted 6.9.0+ #1 Hardware name: LENOVO 20UN005QRT/20UN005QRT <...BIOS details...> Call Trace: <TASK> dump_stack_lvl+0x2d/0x90 __ubsan_handle_out_of_bounds+0xe7/0x140 ? timerqueue_add+0x98/0xb0 ieee80211_prep_hw_scan+0x2db/0x480 [mac80211] ? __kmalloc+0xe1/0x470 __ieee80211_start_scan+0x541/0x760 [mac80211] rdev_scan+0x1f/0xe0 [cfg80211] nl80211_trigger_scan+0x9b6/0xae0 [cfg80211] ...<the rest is not too useful...> Since '__ieee80211_start_scan()' leaves 'hw_scan_req->req.n_channels' uninitialized, actual boundaries of 'hw_scan_req->req.channels' can't be checked in 'ieee80211_prep_hw_scan()'. Although an initialization of 'hw_scan_req->req.n_channels' introduces some confusion around allocated vs. used VLA members, this shouldn't be a problem since everything is correctly adjusted soon in 'ieee80211_prep_hw_scan()'. Cleanup 'kmalloc()' math in '__ieee80211_start_scan()' by using the convenient 'struct_size()' as well. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://msgid.link/20240517153332.18271-2-dmantipov@yandex.ru [improve (imho) indentation a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-05-29wifi: mac80211: correctly parse Spatial Reuse Parameter Set elementLingbo Kong1-2/+8
Currently, the way of parsing Spatial Reuse Parameter Set element is incorrect and some members of struct ieee80211_he_obss_pd are not assigned. To address this issue, it must be parsed in the order of the elements of Spatial Reuse Parameter Set defined in the IEEE Std 802.11ax specification. The diagram of the Spatial Reuse Parameter Set element (IEEE Std 802.11ax -2021-9.4.2.252). ------------------------------------------------------------------------- | | | | |Non-SRG| SRG | SRG | SRG | SRG | |Element|Length| Element | SR |OBSS PD|OBSS PD|OBSS PD| BSS |Partial| | ID | | ID |Control| Max | Min | Max |Color | BSSID | | | |Extension| | Offset| Offset|Offset |Bitmap|Bitmap | ------------------------------------------------------------------------- Fixes: 1ced169cc1c2 ("mac80211: allow setting spatial reuse parameters from bss_conf") Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com> Link: https://msgid.link/20240516021854.5682-3-quic_lingbok@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-05-29wifi: mac80211: fix Spatial Reuse element size checkLingbo Kong1-1/+1
Currently, the way to check the size of Spatial Reuse IE data in the ieee80211_parse_extension_element() is incorrect. This is because the len variable in the ieee80211_parse_extension_element() function is equal to the size of Spatial Reuse IE data minus one and the value of returned by the ieee80211_he_spr_size() function is equal to the length of Spatial Reuse IE data. So the result of the len >= ieee80211_he_spr_size(data) statement always false. To address this issue and make it consistent with the logic used elsewhere with ieee80211_he_oper_size(), change the "len >= ieee80211_he_spr_size(data)" to “len >= ieee80211_he_spr_size(data) - 1”. Fixes: 9d0480a7c05b ("wifi: mac80211: move element parsing to a new file") Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com> Link: https://msgid.link/20240516021854.5682-2-quic_lingbok@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-05-29wifi: iwlwifi: mvm: don't read past the mfuart notifcationEmmanuel Grumbach1-10/+0
In case the firmware sends a notification that claims it has more data than it has, we will read past that was allocated for the notification. Remove the print of the buffer, we won't see it by default. If needed, we can see the content with tracing. This was reported by KFENCE. Fixes: bdccdb854f2f ("iwlwifi: mvm: support MFUART dump in case of MFUART assert") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.ba82a01a559e.Ia91dd20f5e1ca1ad380b95e68aebf2794f553d9b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>