summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-06-18Merge tag 'ata-6.4-rc7' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fix from Damien Le Moal: - Avoid deadlocks on resume from sleep by delaying scsi rescan until the scsi device is also fully resumed. * tag 'ata-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-scsi: Avoid deadlock on rescan after device resume
2023-06-18ata: libata-scsi: Avoid deadlock on rescan after device resumeDamien Le Moal1-1/+1
When an ATA port is resumed from sleep, the port is reset and a power management request issued to libata EH to reset the port and rescanning the device(s) attached to the port. Device rescanning is done by scheduling an ata_scsi_dev_rescan() work, which will execute scsi_rescan_device(). However, scsi_rescan_device() takes the generic device lock, which is also taken by dpm_resume() when the SCSI device is resumed as well. If a device rescan execution starts before the completion of the SCSI device resume, the rcu locking used to refresh the cached VPD pages of the device, combined with the generic device locking from scsi_rescan_device() and from dpm_resume() can cause a deadlock. Avoid this situation by changing struct ata_port scsi_rescan_task to be a delayed work instead of a simple work_struct. ata_scsi_dev_rescan() is modified to check if the SCSI device associated with the ATA device that must be rescanned is not suspended. If the SCSI device is still suspended, ata_scsi_dev_rescan() returns early and reschedule itself for execution after an arbitrary delay of 5ms. Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reported-by: Joe Breuer <linux-kernel@jmbreuer.net> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217530 Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Joe Breuer <linux-kernel@jmbreuer.net>
2023-06-16Merge tag 'urgent-rcu.2023.06.11a' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU fix from Paul McKenney: "This fixes a spinlock-initialization regression in SRCU that causes the SRCU notifier to fail. The fix simply adds the initialization, but introduces a #ifdef because there is no spinlock to initialize for the Tiny SRCU used in !SMP builds. Yes, it would be nice to abstract this somehow in order to hide it in SRCU, but I still don't see a good way of doing this" * tag 'urgent-rcu.2023.06.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: notifier: Initialize new struct srcu_usage field
2023-06-16Merge tag 'net-6.4-rc7' of ↵Linus Torvalds4-3/+13
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless, and netfilter. Selftests excluded - we have 58 patches and diff of +442/-199, which isn't really small but perhaps with the exception of the WiFi locking change it's old(ish) bugs. We have no known problems with v6.4. The selftest changes are rather large as MPTCP folks try to apply Greg's guidance that selftest from torvalds/linux should be able to run against stable kernels. Last thing I should call out is the DCCP/UDP-lite deprecation notices. We are fairly sure those are dead, but if we're wrong reverting them back in won't be fun. Current release - regressions: - wifi: - cfg80211: fix double lock bug in reg_wdev_chan_valid() - iwlwifi: mvm: spin_lock_bh() to fix lockdep regression Current release - new code bugs: - handshake: remove fput() that causes use-after-free Previous releases - regressions: - sched: cls_u32: fix reference counter leak leading to overflow - sched: cls_api: fix lockup on flushing explicitly created chain Previous releases - always broken: - nf_tables: integrate pipapo into commit protocol - nf_tables: incorrect error path handling with NFT_MSG_NEWRULE, fix dangling pointer on failure - ping6: fix send to link-local addresses with VRF - sched: act_pedit: parse L3 header for L4 offset, the skb may not have the offset saved - sched: act_ct: fix promotion of offloaded unreplied tuple - sched: refuse to destroy an ingress and clsact Qdiscs if there are lockless change operations in flight - wifi: mac80211: fix handful of bugs in multi-link operation - ipvlan: fix bound dev checking for IPv6 l3s mode - eth: enetc: correct the indexes of highest and 2nd highest TCs - eth: ice: fix XDP memory leak when NIC is brought up and down Misc: - add deprecation notices for UDP-lite and DCCP - selftests: mptcp: skip tests not supported by old kernels - sctp: handle invalid error codes without calling BUG()" * tag 'net-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) dccp: Print deprecation notice. udplite: Print deprecation notice. octeon_ep: Add missing check for ioremap selftests/ptp: Fix timestamp printf format for PTP_SYS_OFFSET net: ethernet: stmicro: stmmac: fix possible memory leak in __stmmac_open net: tipc: resize nlattr array to correct size sfc: fix XDP queues mode with legacy IRQ net: macsec: fix double free of percpu stats net: lapbether: only support ethernet devices MAINTAINERS: add reviewers for SMC Sockets s390/ism: Fix trying to free already-freed IRQ by repeated ism_dev_exit() net: dsa: felix: fix taprio guard band overflow at 10Mbps with jumbo frames net/sched: cls_api: Fix lockup on flushing explicitly created chain ice: Fix ice module unload net/handshake: remove fput() that causes use-after-free selftests: forwarding: hw_stats_l3: Set addrgenmode in a separate step net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs net/sched: act_ct: Fix promotion of offloaded unreplied tuple wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression ...
2023-06-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2-23/+12
Pull rdma fixes from Jason Gunthorpe: "This is an unusually large bunch of bug fixes for the later rc cycle, rxe and mlx5 both dumped a lot of things at once. rxe continues to fix itself, and mlx5 is fixing a bunch of "queue counters" related bugs. There is one highly notable bug fix regarding the qkey. This small security check was missed in the original 2005 implementation and it allows some significant issues. Summary: - Two rtrs bug fixes for error unwind bugs - Several rxe bug fixes: * Incorrect Rx packet validation * Using memory without a refcount * Syzkaller found use before initialization * Regression fix for missing locking with the tasklet conversion from this merge window - Have bnxt report the correct link properties to userspace, this was a regression in v6.3 - Several mlx5 bug fixes: * Kernel crash triggerable by userspace for the RAW ethernet profile * Defend against steering refcounting issues created by userspace * Incorrect change of QP port affinity parameters in some LAG configurations - Fix mlx5 Q counters: * Do not over allocate Q counters to allow userspace to use the full port capacity * Kernel crash triggered by eswitch due to mis-use of Q counters * Incorrect mlx5_device for Q counters in some LAG configurations - Properly implement the IBA spec restricting privileged qkeys to root - Always an error when reading from a disassociated device's event queue - isert bug fixes: * Avoid a deadlock with the CM handler and CM ID destruction * Correct list corruption due to incorrect locking * Fix a use after free around connection tear down" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/rxe: Fix rxe_cq_post IB/isert: Fix incorrect release of isert connection IB/isert: Fix possible list corruption in CMA handler IB/isert: Fix dead lock in ib_isert RDMA/mlx5: Fix affinity assignment IB/uverbs: Fix to consider event queue closing also upon non-blocking mode RDMA/uverbs: Restrict usage of privileged QKEYs RDMA/cma: Always set static rate to 0 for RoCE RDMA/mlx5: Fix Q-counters query in LAG mode RDMA/mlx5: Remove vport Q-counters dependency on normal Q-counters RDMA/mlx5: Fix Q-counters per vport allocation RDMA/mlx5: Create an indirect flow table for steering anchor RDMA/mlx5: Initiate dropless RQ for RAW Ethernet functions RDMA/rxe: Fix the use-before-initialization error of resp_pkts RDMA/bnxt_re: Fix reporting active_{speed,width} attributes RDMA/rxe: Fix ref count error in check_rkey() RDMA/rxe: Fix packet length checks RDMA/rtrs: Fix rxe_dealloc_pd warning RDMA/rtrs: Fix the last iu->buf leak in err path
2023-06-16Merge tag 'media/v6.4-6' of ↵Linus Torvalds1-5/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A fix for dvb-core to avoid a race condition during DVB board registration" * tag 'media/v6.4-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"
2023-06-15Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"Mauro Carvalho Chehab1-5/+1
As reported by Thomas Voegtle <tv@lio96.de>, sometimes a DVB card does not initialize properly booting Linux 6.4-rc4. This is not always, maybe in 3 out of 4 attempts. After double-checking, the root cause seems to be related to the UAF fix, which is causing a race issue: [ 26.332149] tda10071 7-0005: found a 'NXP TDA10071' in cold state, will try to load a firmware [ 26.340779] tda10071 7-0005: downloading firmware from file 'dvb-fe-tda10071.fw' [ 989.277402] INFO: task vdr:743 blocked for more than 491 seconds. [ 989.283504] Not tainted 6.4.0-rc5-i5 #249 [ 989.288036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 989.295860] task:vdr state:D stack:0 pid:743 ppid:711 flags:0x00004002 [ 989.295865] Call Trace: [ 989.295867] <TASK> [ 989.295869] __schedule+0x2ea/0x12d0 [ 989.295877] ? asm_sysvec_apic_timer_interrupt+0x16/0x20 [ 989.295881] schedule+0x57/0xc0 [ 989.295884] schedule_preempt_disabled+0xc/0x20 [ 989.295887] __mutex_lock.isra.16+0x237/0x480 [ 989.295891] ? dvb_get_property.isra.10+0x1bc/0xa50 [ 989.295898] ? dvb_frontend_stop+0x36/0x180 [ 989.338777] dvb_frontend_stop+0x36/0x180 [ 989.338781] dvb_frontend_open+0x2f1/0x470 [ 989.338784] dvb_device_open+0x81/0xf0 [ 989.338804] ? exact_lock+0x20/0x20 [ 989.338808] chrdev_open+0x7f/0x1c0 [ 989.338811] ? generic_permission+0x1a2/0x230 [ 989.338813] ? link_path_walk.part.63+0x340/0x380 [ 989.338815] ? exact_lock+0x20/0x20 [ 989.338817] do_dentry_open+0x18e/0x450 [ 989.374030] path_openat+0xca5/0xe00 [ 989.374031] ? terminate_walk+0xec/0x100 [ 989.374034] ? path_lookupat+0x93/0x140 [ 989.374036] do_filp_open+0xc0/0x140 [ 989.374038] ? __call_rcu_common.constprop.91+0x92/0x240 [ 989.374041] ? __check_object_size+0x147/0x260 [ 989.374043] ? __check_object_size+0x147/0x260 [ 989.374045] ? alloc_fd+0xbb/0x180 [ 989.374048] ? do_sys_openat2+0x243/0x310 [ 989.374050] do_sys_openat2+0x243/0x310 [ 989.374052] do_sys_open+0x52/0x80 [ 989.374055] do_syscall_64+0x5b/0x80 [ 989.421335] ? __task_pid_nr_ns+0x92/0xa0 [ 989.421337] ? syscall_exit_to_user_mode+0x20/0x40 [ 989.421339] ? do_syscall_64+0x67/0x80 [ 989.421341] ? syscall_exit_to_user_mode+0x20/0x40 [ 989.421343] ? do_syscall_64+0x67/0x80 [ 989.421345] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 989.421348] RIP: 0033:0x7fe895d067e3 [ 989.421349] RSP: 002b:00007fff933c2ba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101 [ 989.421351] RAX: ffffffffffffffda RBX: 00007fff933c2c10 RCX: 00007fe895d067e3 [ 989.421352] RDX: 0000000000000802 RSI: 00005594acdce160 RDI: 00000000ffffff9c [ 989.421353] RBP: 0000000000000802 R08: 0000000000000000 R09: 0000000000000000 [ 989.421353] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000001 [ 989.421354] R13: 00007fff933c2ca0 R14: 00000000ffffffff R15: 00007fff933c2c90 [ 989.421355] </TASK> This reverts commit 6769a0b7ee0c3b31e1b22c3fadff2bfb642de23f. Fixes: 6769a0b7ee0c ("media: dvb-core: Fix use-after-free on race condition at dvb_frontend") Link: https://lore.kernel.org/all/da5382ad-09d6-20ac-0d53-611594b30861@lio96.de/ Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-14net/sched: qdisc_destroy() old ingress and clsact Qdiscs before graftingPeilin Ye1-0/+8
mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized in ingress_init() to point to net_device::miniq_ingress. ingress Qdiscs access this per-net_device pointer in mini_qdisc_pair_swap(). Similar for clsact Qdiscs and miniq_egress. Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER requests (thanks Hillf Danton for the hint), when replacing ingress or clsact Qdiscs, for example, the old Qdisc ("@old") could access the same miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"), causing race conditions [1] including a use-after-free bug in mini_qdisc_pair_swap() reported by syzbot: BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901 ... Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319 print_report mm/kasan/report.c:430 [inline] kasan_report+0x11c/0x130 mm/kasan/report.c:536 mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 tcf_chain_head_change_item net/sched/cls_api.c:495 [inline] tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509 tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline] tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline] tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266 ... @old and @new should not affect each other. In other words, @old should never modify miniq_{in,e}gress after @new, and @new should not update @old's RCU state. Fixing without changing sch_api.c turned out to be difficult (please refer to Closes: for discussions). Instead, make sure @new's first call always happen after @old's last call (in {ingress,clsact}_destroy()) has finished: In qdisc_graft(), return -EBUSY if @old has any ongoing filter requests, and call qdisc_destroy() for @old before grafting @new. Introduce qdisc_refcount_dec_if_one() as the counterpart of qdisc_refcount_inc_nz() used for filter requests. Introduce a non-static version of qdisc_destroy() that does a TCQ_F_BUILTIN check, just like qdisc_put() etc. Depends on patch "net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs". [1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under TC_H_ROOT (no longer possible after commit c7cfbd115001 ("net/sched: sch_ingress: Only create under TC_H_INGRESS")) on eth0 that has 8 transmission queues: Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2), then adds a flower filter X to A. Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and b2) to replace A, then adds a flower filter Y to B. Thread 1 A's refcnt Thread 2 RTM_NEWQDISC (A, RTNL-locked) qdisc_create(A) 1 qdisc_graft(A) 9 RTM_NEWTFILTER (X, RTNL-unlocked) __tcf_qdisc_find(A) 10 tcf_chain0_head_change(A) mini_qdisc_pair_swap(A) (1st) | | RTM_NEWQDISC (B, RTNL-locked) RCU sync 2 qdisc_graft(B) | 1 notify_and_destroy(A) | tcf_block_release(A) 0 RTM_NEWTFILTER (Y, RTNL-unlocked) qdisc_destroy(A) tcf_chain0_head_change(B) tcf_chain0_head_change_cb_del(A) mini_qdisc_pair_swap(B) (2nd) mini_qdisc_pair_swap(A) (3rd) | ... ... Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to its mini Qdisc, b1. Then, A calls mini_qdisc_pair_swap() again during ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress packets on eth0 will not find filter Y in sch_handle_ingress(). This is just one of the possible consequences of concurrently accessing miniq_{in,e}gress pointers. Fixes: 7a096d579e8e ("net: sched: ingress: set 'unlocked' flag for Qdisc ops") Fixes: 87f373921c4e ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops") Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/ Cc: Hillf Danton <hdanton@sina.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: act_ct: Fix promotion of offloaded unreplied tuplePaul Blakey1-1/+1
Currently UNREPLIED and UNASSURED connections are added to the nf flow table. This causes the following connection packets to be processed by the flow table which then skips conntrack_in(), and thus such the connections will remain UNREPLIED and UNASSURED even if reply traffic is then seen. Even still, the unoffloaded reply packets are the ones triggering hardware update from new to established state, and if there aren't any to triger an update and/or previous update was missed, hardware can get out of sync with sw and still mark packets as new. Fix the above by: 1) Not skipping conntrack_in() for UNASSURED packets, but still refresh for hardware, as before the cited patch. 2) Try and force a refresh by reply-direction packets that update the hardware rules from new to established state. 3) Remove any bidirectional flows that didn't failed to update in hardware for re-insertion as bidrectional once any new packet arrives. Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections") Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-12net: ethtool: correct MAX attribute value for statsJakub Kicinski1-1/+1
When compiling YNL generated code compiler complains about array-initializer-out-of-bounds. Turns out the MAX value for STATS_GRP uses the value for STATS. This may lead to random corruptions in user space (kernel itself doesn't use this value as it never parses stats). Fixes: f09ea6fb1272 ("ethtool: add a new command for reading standard stats") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-11RDMA/mlx5: Fix affinity assignmentMark Bloch1-0/+12
The cited commit aimed to ensure that Virtual Functions (VFs) assign a queue affinity to a Queue Pair (QP) to distribute traffic when the LAG master creates a hardware LAG. If the affinity was set while the hardware was not in LAG, the firmware would ignore the affinity value. However, this commit unintentionally assigned an affinity to QPs on the LAG master's VPORT even if the RDMA device was not marked as LAG-enabled. In most cases, this was not an issue because when the hardware entered hardware LAG configuration, the RDMA device of the LAG master would be destroyed and a new one would be created, marked as LAG-enabled. The problem arises when a user configures Equal-Cost Multipath (ECMP). In ECMP mode, traffic can be directed to different physical ports based on the queue affinity, which is intended for use by VPORTS other than the E-Switch manager. ECMP mode is supported only if both E-Switch managers are in switchdev mode and the appropriate route is configured via IP. In this configuration, the RDMA device is not destroyed, and we retain the RDMA device that is not marked as LAG-enabled. To ensure correct behavior, Send Queues (SQs) opened by the E-Switch manager through verbs should be assigned strict affinity. This means they will only be able to communicate through the native physical port associated with the E-Switch manager. This will prevent the firmware from assigning affinity and will not allow the SQs to be remapped in case of failover. Fixes: 802dcc7fc5ec ("RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode") Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://lore.kernel.org/r/425b05f4da840bc684b0f7e8ebf61aeb5cef09b0.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/cma: Always set static rate to 0 for RoCEMark Zhang1-23/+0
Set static rate to 0 as it should be discovered by path query and has no meaning for RoCE. This also avoid of using the rtnl lock and ethtool API, which is a bottleneck when try to setup many rdma-cm connections at the same time, especially with multiple processes. Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/f72a4f8b667b803aee9fa794069f61afb5839ce4.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-10Merge tag 'arm-fixes-6.4-2' of ↵Linus Torvalds2-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "Most of the changes this time are for the Qualcomm Snapdragon platforms. There are bug fixes for error handling in Qualcomm icc-bwmon, rpmh-rsc, ramp_controller and rmtfs driver as well as the AMD tee firmware driver and a missing initialization in the Arm ff-a firmware driver. The Qualcomm RPMh and EDAC drivers need some rework to work correctly on all supported chips. The DT fixes include: - i.MX8 fixes for gpio, pinmux and clock settings - ADS touchscreen gpio polarity settings in several machines - Address dtb warnings for caches, panel and input-enable properties on Qualcomm platforms - Incorrect data on qualcomm platforms fir SA8155P power domains, SM8550 LLCC, SC7180-lite SDRAM frequencies and SM8550 soundwire - Remoteproc firmware paths are corrected for Sony Xperia 10 IV" * tag 'arm-fixes-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (36 commits) firmware: arm_ffa: Set handle field to zero in memory descriptor ARM: dts: Fix erroneous ADS touchscreen polarities arm64: dts: imx8mn-beacon: Fix SPI CS pinmux arm64: dts: imx8-ss-dma: assign default clock rate for lpuarts arm64: dts: imx8qm-mek: correct GPIOs for USDHC2 CD and WP signals EDAC/qcom: Get rid of hardcoded register offsets EDAC/qcom: Remove superfluous return variable assignment in qcom_llcc_core_setup() arm64: dts: qcom: sm8550: Use the correct LLCC register scheme dt-bindings: cache: qcom,llcc: Fix SM8550 description arm64: dts: qcom: sc7180-lite: Fix SDRAM freq for misidentified sc7180-lite boards arm64: dts: qcom: sm8550: use uint16 for Soundwire interval soc: qcom: rpmhpd: Add SA8155P power domains arm64: dts: qcom: Split out SA8155P and use correct RPMh power domains dt-bindings: power: qcom,rpmpd: Add SA8155P soc: qcom: Rename ice to qcom_ice to avoid module name conflict soc: qcom: rmtfs: Fix error code in probe() soc: qcom: ramp_controller: Fix an error handling path in qcom_ramp_controller_probe() ARM: dts: at91: sama7g5ek: fix debounce delay property for shdwc ARM: at91: pm: fix imbalanced reference counter for ethernet devices arm64: dts: qcom: sm6375-pdx225: Fix remoteproc firmware paths ...
2023-06-10Merge tag 'nf-23-06-08' of ↵David S. Miller1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf netfilter pull request 23-06-08 Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for net: 1) Add commit and abort set operation to pipapo set abort path. 2) Bail out immediately in case of ENOMEM in nfnetlink batch. 3) Incorrect error path handling when creating a new rule leads to dangling pointer in set transaction list. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-08Merge tag 'net-6.4-rc6' of ↵Linus Torvalds11-21/+33
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can, wifi, netfilter, bluetooth and ebpf. Current release - regressions: - bpf: sockmap: avoid potential NULL dereference in sk_psock_verdict_data_ready() - wifi: iwlwifi: fix -Warray-bounds bug in iwl_mvm_wait_d3_notif() - phylink: actually fix ksettings_set() ethtool call - eth: dwmac-qcom-ethqos: fix a regression on EMAC < 3 Current release - new code bugs: - wifi: mt76: fix possible NULL pointer dereference in mt7996_mac_write_txwi() Previous releases - regressions: - netfilter: fix NULL pointer dereference in nf_confirm_cthelper - wifi: rtw88/rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS - openvswitch: fix upcall counter access before allocation - bluetooth: - fix use-after-free in hci_remove_ltk/hci_remove_irk - fix l2cap_disconnect_req deadlock - nic: bnxt_en: prevent kernel panic when receiving unexpected PHC_UPDATE event Previous releases - always broken: - core: annotate rfs lockless accesses - sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values - netfilter: add null check for nla_nest_start_noflag() in nft_dump_basechain_hook() - bpf: fix UAF in task local storage - ipv4: ping_group_range: allow GID from 2147483648 to 4294967294 - ipv6: rpl: fix route of death. - tcp: gso: really support BIG TCP - mptcp: fixes for user-space PM address advertisement - smc: avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT - can: avoid possible use-after-free when j1939_can_rx_register fails - batman-adv: fix UaF while rescheduling delayed work - eth: qede: fix scheduling while atomic - eth: ice: make writes to /dev/gnssX synchronous" * tag 'net-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits) bnxt_en: Implement .set_port / .unset_port UDP tunnel callbacks bnxt_en: Prevent kernel panic when receiving unexpected PHC_UPDATE event bnxt_en: Skip firmware fatal error recovery if chip is not accessible bnxt_en: Query default VLAN before VNIC setup on a VF bnxt_en: Don't issue AP reset during ethtool's reset operation bnxt_en: Fix bnxt_hwrm_update_rss_hash_cfg() net: bcmgenet: Fix EEE implementation eth: ixgbe: fix the wake condition eth: bnxt: fix the wake condition lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release() bpf: Add extra path pointer check to d_path helper net: sched: fix possible refcount leak in tc_chain_tmplt_add() net: sched: act_police: fix sparse errors in tcf_police_dump() net: openvswitch: fix upcall counter access before allocation net: sched: move rtm_tca_policy declaration to include file ice: make writes to /dev/gnssX synchronous net: sched: add rcu annotations around qdisc->qdisc_sleeping rfs: annotate lockless accesses to RFS sock flow table rfs: annotate lockless accesses to sk->sk_rxhash virtio_net: use control_buf for coalesce params ...
2023-06-08Merge tag 'for-netdev' of ↵Jakub Kicinski1-0/+1
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-06-07 We've added 7 non-merge commits during the last 7 day(s) which contain a total of 12 files changed, 112 insertions(+), 7 deletions(-). The main changes are: 1) Fix a use-after-free in BPF's task local storage, from KP Singh. 2) Make struct path handling more robust in bpf_d_path, from Jiri Olsa. 3) Fix a syzbot NULL-pointer dereference in sockmap, from Eric Dumazet. 4) UAPI fix for BPF_NETFILTER before final kernel ships, from Florian Westphal. 5) Fix map-in-map array_map_gen_lookup code generation where elem_size was not being set for inner maps, from Rhys Rustad-Elliott. 6) Fix sockopt_sk selftest's NETLINK_LIST_MEMBERSHIPS assertion, from Yonghong Song. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add extra path pointer check to d_path helper selftests/bpf: Fix sockopt_sk selftest bpf: netfilter: Add BPF_NETFILTER bpf_attach_type selftests/bpf: Add access_inner_map selftest bpf: Fix elem_size not being set for inner maps bpf: Fix UAF in task local storage bpf, sockmap: Avoid potential NULL dereference in sk_psock_verdict_data_ready() ==================== Link: https://lore.kernel.org/r/20230607220514.29698-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08netfilter: nf_tables: integrate pipapo into commit protocolPablo Neira Ayuso1-1/+3
The pipapo set backend follows copy-on-update approach, maintaining one clone of the existing datastructure that is being updated. The clone and current datastructures are swapped via rcu from the commit step. The existing integration with the commit protocol is flawed because there is no operation to clean up the clone if the transaction is aborted. Moreover, the datastructure swap happens on set element activation. This patch adds two new operations for sets: commit and abort, these new operations are invoked from the commit and abort steps, after the transactions have been digested, and it updates the pipapo set backend to use it. This patch adds a new ->pending_update field to sets to maintain a list of sets that require this new commit and abort operations. Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-07notifier: Initialize new struct srcu_usage fieldChen-Yu Tsai1-0/+10
In commit 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update"), a new struct srcu_usage field was added, but was not properly initialized. This led to a "spinlock bad magic" BUG when the SRCU notifier was ever used. This was observed in the MediaTek CCI devfreq driver on next-20230525. The trimmed stack trace is as follows: BUG: spinlock bad magic on CPU#4, swapper/0/1 lock: 0xffffff80ff529ac0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 Call trace: spin_bug+0xa4/0xe8 do_raw_spin_lock+0xec/0x120 _raw_spin_lock_irqsave+0x78/0xb8 synchronize_srcu+0x3c/0x168 srcu_notifier_chain_unregister+0x5c/0xa0 cpufreq_unregister_notifier+0x94/0xe0 devfreq_passive_event_handler+0x7c/0x3e0 devfreq_remove_device+0x48/0xe8 Add __SRCU_USAGE_INIT() to SRCU_NOTIFIER_INIT() so that srcu_usage gets initialized properly. Reported-by: Jon Hunter <jonathanh@nvidia.com> Fixes: 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Sachin Sant <sachinp@linux.ibm.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Zqiang <qiang.zhang1211@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-06-07net: sched: move rtm_tca_policy declaration to include fileEric Dumazet1-0/+2
rtm_tca_policy is used from net/sched/sch_api.c and net/sched/cls_api.c, thus should be declared in an include file. This fixes the following sparse warning: net/sched/sch_api.c:1434:25: warning: symbol 'rtm_tca_policy' was not declared. Should it be static? Fixes: e331473fee3d ("net/sched: cls_api: add missing validation of netlink attributes") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07net: sched: add rcu annotations around qdisc->qdisc_sleepingEric Dumazet2-3/+5
syzbot reported a race around qdisc->qdisc_sleeping [1] It is time we add proper annotations to reads and writes to/from qdisc->qdisc_sleeping. [1] BUG: KCSAN: data-race in dev_graft_qdisc / qdisc_lookup_rcu read to 0xffff8881286fc618 of 8 bytes by task 6928 on cpu 1: qdisc_lookup_rcu+0x192/0x2c0 net/sched/sch_api.c:331 __tcf_qdisc_find+0x74/0x3c0 net/sched/cls_api.c:1174 tc_get_tfilter+0x18f/0x990 net/sched/cls_api.c:2547 rtnetlink_rcv_msg+0x7af/0x8c0 net/core/rtnetlink.c:6386 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd write to 0xffff8881286fc618 of 8 bytes by task 6912 on cpu 0: dev_graft_qdisc+0x4f/0x80 net/sched/sch_generic.c:1115 qdisc_graft+0x7d0/0xb60 net/sched/sch_api.c:1103 tc_modify_qdisc+0x712/0xf10 net/sched/sch_api.c:1693 rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6395 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 6912 Comm: syz-executor.5 Not tainted 6.4.0-rc3-syzkaller-00190-g0d85b27b0cc6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/16/2023 Fixes: 3a7d0d07a386 ("net: sched: extend Qdisc with rcu") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Buslov <vladbu@nvidia.com> Acked-by: Jamal Hadi Salim<jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07rfs: annotate lockless accesses to RFS sock flow tableEric Dumazet1-2/+5
Add READ_ONCE()/WRITE_ONCE() on accesses to the sock flow table. This also prevents a (smart ?) compiler to remove the condition in: if (table->ents[index] != newval) table->ents[index] = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07rfs: annotate lockless accesses to sk->sk_rxhashEric Dumazet1-5/+13
Add READ_ONCE()/WRITE_ONCE() on accesses to sk->sk_rxhash. This also prevents a (smart ?) compiler to remove the condition in: if (sk->sk_rxhash != newval) sk->sk_rxhash = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07Merge tag 'for-net-2023-06-05' of ↵Jakub Kicinski2-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fixes to debugfs registration - Fix use-after-free in hci_remove_ltk/hci_remove_irk - Fixes to ISO channel support - Fix missing checks for invalid L2CAP DCID - Fix l2cap_disconnect_req deadlock - Add lock to protect HCI_UNREGISTER * tag 'for-net-2023-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: L2CAP: Add missing checks for invalid DCID Bluetooth: ISO: use correct CIS order in Set CIG Parameters event Bluetooth: ISO: don't try to remove CIG if there are bound CIS left Bluetooth: Fix l2cap_disconnect_req deadlock Bluetooth: hci_qca: fix debugfs registration Bluetooth: fix debugfs registration Bluetooth: hci_sync: add lock to protect HCI_UNREGISTER Bluetooth: Fix use-after-free in hci_remove_ltk/hci_remove_irk Bluetooth: ISO: Fix CIG auto-allocation to select configurable CIG Bluetooth: ISO: consider right CIS when removing CIG at cleanup ==================== Link: https://lore.kernel.org/r/20230606003454.2392552-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-07ipv6: rpl: Fix Route of Death.Kuniyuki Iwashima1-3/+0
A remote DoS vulnerability of RPL Source Routing is assigned CVE-2023-2156. The Source Routing Header (SRH) has the following format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Next Header | Hdr Ext Len | Routing Type | Segments Left | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | CmprI | CmprE | Pad | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | . . . Addresses[1..n] . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ The originator of an SRH places the first hop's IPv6 address in the IPv6 header's IPv6 Destination Address and the second hop's IPv6 address as the first address in Addresses[1..n]. The CmprI and CmprE fields indicate the number of prefix octets that are shared with the IPv6 Destination Address. When CmprI or CmprE is not 0, Addresses[1..n] are compressed as follows: 1..n-1 : (16 - CmprI) bytes n : (16 - CmprE) bytes Segments Left indicates the number of route segments remaining. When the value is not zero, the SRH is forwarded to the next hop. Its address is extracted from Addresses[n - Segment Left + 1] and swapped with IPv6 Destination Address. When Segment Left is greater than or equal to 2, the size of SRH is not changed because Addresses[1..n-1] are decompressed and recompressed with CmprI. OTOH, when Segment Left changes from 1 to 0, the new SRH could have a different size because Addresses[1..n-1] are decompressed with CmprI and recompressed with CmprE. Let's say CmprI is 15 and CmprE is 0. When we receive SRH with Segment Left >= 2, Addresses[1..n-1] have 1 byte for each, and Addresses[n] has 16 bytes. When Segment Left is 1, Addresses[1..n-1] is decompressed to 16 bytes and not recompressed. Finally, the new SRH will need more room in the header, and the size is (16 - 1) * (n - 1) bytes. Here the max value of n is 255 as Segment Left is u8, so in the worst case, we have to allocate 3825 bytes in the skb headroom. However, now we only allocate a small fixed buffer that is IPV6_RPL_SRH_WORST_SWAP_SIZE (16 + 7 bytes). If the decompressed size overflows the room, skb_push() hits BUG() below [0]. Instead of allocating the fixed buffer for every packet, let's allocate enough headroom only when we receive SRH with Segment Left 1. [0]: skbuff: skb_under_panic: text:ffffffff81c9f6e2 len:576 put:576 head:ffff8880070b5180 data:ffff8880070b4fb0 tail:0x70 end:0x140 dev:lo kernel BUG at net/core/skbuff.c:200! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 154 Comm: python3 Not tainted 6.4.0-rc4-00190-gc308e9ec0047 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:skb_panic (net/core/skbuff.c:200) Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 80 6e 77 82 e8 ad 8b 60 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffc90000003da0 EFLAGS: 00000246 RAX: 0000000000000085 RBX: ffff8880058a6600 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88807dc1c540 RDI: ffff88807dc1c540 RBP: ffffc90000003e48 R08: ffffffff82b392c8 R09: 00000000ffffdfff R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888005b1c800 R13: ffff8880070b51b8 R14: ffff888005b1ca18 R15: ffff8880070b5190 FS: 00007f4539f0b740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055670baf3000 CR3: 0000000005b0e000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <IRQ> skb_push (net/core/skbuff.c:210) ipv6_rthdr_rcv (./include/linux/skbuff.h:2880 net/ipv6/exthdrs.c:634 net/ipv6/exthdrs.c:718) ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5)) ip6_input_finish (./include/linux/rcupdate.h:805 net/ipv6/ip6_input.c:483) __netif_receive_skb_one_core (net/core/dev.c:5494) process_backlog (./include/linux/rcupdate.h:805 net/core/dev.c:5934) __napi_poll (net/core/dev.c:6496) net_rx_action (net/core/dev.c:6565 net/core/dev.c:6696) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:572) do_softirq (kernel/softirq.c:472 kernel/softirq.c:459) </IRQ> <TASK> __local_bh_enable_ip (kernel/softirq.c:396) __dev_queue_xmit (net/core/dev.c:4272) ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:134) rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914) sock_sendmsg (net/socket.c:724 net/socket.c:747) __sys_sendto (net/socket.c:2144) __x64_sys_sendto (net/socket.c:2156 net/socket.c:2152 net/socket.c:2152) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7f453a138aea Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007ffcc212a1c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007ffcc212a288 RCX: 00007f453a138aea RDX: 0000000000000060 RSI: 00007f4539084c20 RDI: 0000000000000003 RBP: 00007f4538308e80 R08: 00007ffcc212a300 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f4539712d1b </TASK> Modules linked in: Fixes: 8610c7c6e3bd ("net: ipv6: add support for rpl sr exthdr") Reported-by: Max VA Closes: https://www.interruptlabs.co.uk/articles/linux-ipv6-route-of-death Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230605180617.67284-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06Merge tag 'platform-drivers-x86-v6.4-4' of ↵Linus Torvalds1-5/+1
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - various Microsoft Surface support fixes - one fix for the INT3472 driver * tag 'platform-drivers-x86-v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: int3472: Avoid crash in unregistering regulator gpio platform/surface: aggregator_tabletsw: Add support for book mode in POS subsystem platform/surface: aggregator_tabletsw: Add support for book mode in KIP subsystem platform/surface: aggregator: Allow completion work-items to be executed in parallel platform/surface: aggregator: Make to_ssam_device_driver() respect constness
2023-06-06Bluetooth: ISO: use correct CIS order in Set CIG Parameters eventPauli Virtanen1-1/+2
The order of CIS handle array in Set CIG Parameters response shall match the order of the CIS_ID array in the command (Core v5.3 Vol 4 Part E Sec 7.8.97). We send CIS_IDs mainly in the order of increasing CIS_ID (but with "last" CIS first if it has fixed CIG_ID). In handling of the reply, we currently assume this is also the same as the order of hci_conn in hdev->conn_hash, but that is not true. Match the correct hci_conn to the correct handle by matching them based on the CIG+CIS combination. The CIG+CIS combination shall be unique for ISO_LINK hci_conn at state >= BT_BOUND, which we maintain in hci_le_set_cig_params. Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-06-06Bluetooth: fix debugfs registrationJohan Hovold1-0/+1
Since commit ec6cef9cd98d ("Bluetooth: Fix SMP channel registration for unconfigured controllers") the debugfs interface for unconfigured controllers will be created when the controller is configured. There is however currently nothing preventing a controller from being configured multiple time (e.g. setting the device address using btmgmt) which results in failed attempts to register the already registered debugfs entries: debugfs: File 'features' in directory 'hci0' already present! debugfs: File 'manufacturer' in directory 'hci0' already present! debugfs: File 'hci_version' in directory 'hci0' already present! ... debugfs: File 'quirk_simultaneous_discovery' in directory 'hci0' already present! Add a controller flag to avoid trying to register the debugfs interface more than once. Fixes: ec6cef9cd98d ("Bluetooth: Fix SMP channel registration for unconfigured controllers") Cc: stable@vger.kernel.org # 4.0 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-06-06Bluetooth: hci_sync: add lock to protect HCI_UNREGISTERZhengping Jiang1-0/+1
When the HCI_UNREGISTER flag is set, no jobs should be scheduled. Fix potential race when HCI_UNREGISTER is set after the flag is tested in hci_cmd_sync_queue. Fixes: 0b94f2651f56 ("Bluetooth: hci_sync: Fix queuing commands when HCI_UNREGISTER is set") Signed-off-by: Zhengping Jiang <jiangzp@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-06-06bpf: netfilter: Add BPF_NETFILTER bpf_attach_typeFlorian Westphal1-0/+1
Andrii Nakryiko writes: And we currently don't have an attach type for NETLINK BPF link. Thankfully it's not too late to add it. I see that link_create() in kernel/bpf/syscall.c just bypasses attach_type check. We shouldn't have done that. Instead we need to add BPF_NETLINK attach type to enum bpf_attach_type. And wire all that properly throughout the kernel and libbpf itself. This adds BPF_NETFILTER and uses it. This breaks uabi but this wasn't in any non-rc release yet, so it should be fine. v2: check link_attack prog type in link_create too Fixes: 84601d6ee68a ("bpf: add bpf_link support for BPF_NETFILTER programs") Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/CAEf4BzZ69YgrQW7DHCJUT_X+GqMq_ZQQPBwopaJJVGFD5=d5Vg@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20230605131445.32016-1-fw@strlen.de
2023-06-05Merge tag 'qcom-driver-fixes-for-6.4' of ↵Arnd Bergmann2-6/+9
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm driver fixes for 6.4 Error paths is corrected across icc-bwmon, rpmh-rsc, ramp_controller and rmtfs. The ice module is renamed qcom_ice, to avoid clashing with existing "ice" driver. SA8155P-specific RPMh power-domains are introduced to avoid the code trying to access resources that exists on SM8150, but not on SA8155P. Lastly, changes to the EDAC driver to fix an issue where the driver performs mmio based on the wrong register map. * tag 'qcom-driver-fixes-for-6.4' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: EDAC/qcom: Get rid of hardcoded register offsets EDAC/qcom: Remove superfluous return variable assignment in qcom_llcc_core_setup() dt-bindings: cache: qcom,llcc: Fix SM8550 description soc: qcom: rpmhpd: Add SA8155P power domains dt-bindings: power: qcom,rpmpd: Add SA8155P soc: qcom: Rename ice to qcom_ice to avoid module name conflict soc: qcom: rmtfs: Fix error code in probe() soc: qcom: ramp_controller: Fix an error handling path in qcom_ramp_controller_probe() soc: qcom: rpmh-rsc: drop redundant unsigned >=0 comparision soc: qcom: icc-bwmon: fix incorrect error code passed to dev_err_probe() Link: https://lore.kernel.org/r/20230601141058.2246039-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-04Merge tag 'media/v6.4-4' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Some driver fixes: - a regression fix for the verisilicon driver - uvcvideo: don't expose unsupported video formats to userspace - camss-video: don't zero subdev format after init - mediatek: some fixes for 4K decoder formats - fix a Sphinx build warning (missing doc for client_caps) - some fixes for imx and atomisp staging drivers And two CEC core fixes: - don't set last_initiator if TX in progress - disable adapter in cec_devnode_unregister" * tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: uvcvideo: Don't expose unsupported formats to userspace media: v4l2-subdev: Fix missing kerneldoc for client_caps media: staging: media: imx: initialize hs_settle to avoid warning media: v4l2-mc: Drop subdev check in v4l2_create_fwnode_links_to_pad() media: staging: media: atomisp: init high & low vars media: cec: core: don't set last_initiator if tx in progress media: cec: core: disable adapter in cec_devnode_unregister media: mediatek: vcodec: Only apply 4K frame sizes on decoder formats media: camss: camss-video: Don't zero subdev format again after initialization media: verisilicon: Additional fix for the crash when opening the driver
2023-06-04Merge tag 'char-misc-6.4-rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that resolve a number of reported issues. Included in here are: - iio driver fixes - fpga driver fixes - test_firmware bugfixes - fastrpc driver tiny bugfixes - MAINTAINERS file updates for some subsystems All of these have been in linux-next this past week with no reported issues" * tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (34 commits) test_firmware: fix the memory leak of the allocated firmware buffer test_firmware: fix a memory leak with reqs buffer test_firmware: prevent race conditions by a correct implementation of locking firmware_loader: Fix a NULL vs IS_ERR() check MAINTAINERS: Vaibhav Gupta is the new ipack maintainer dt-bindings: fpga: replace Ivan Bornyakov maintainership MAINTAINERS: update Microchip MPF FPGA reviewers misc: fastrpc: reject new invocations during device removal misc: fastrpc: return -EPIPE to invocations on device removal misc: fastrpc: Reassign memory ownership only for remote heap misc: fastrpc: Pass proper scm arguments for secure map request iio: imu: inv_icm42600: fix timestamp reset iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value iio: dac: mcp4725: Fix i2c_master_send() return value handling iio: accel: kx022a fix irq getting iio: bu27034: Ensure reset is written iio: dac: build ad5758 driver when AD5758 is selected iio: addac: ad74413: fix resistance input processing iio: light: vcnl4035: fixed chip ID check ...
2023-06-04Merge tag 'usb-6.4-rc5' of ↵Linus Torvalds2-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB driver and core fixes for 6.4-rc5. Most of these are tiny driver fixes, including: - udc driver bugfix - f_fs gadget driver bugfix - cdns3 driver bugfix - typec bugfixes But the "big" thing in here is a fix yet-again for how the USB buffers are handled from userspace when dealing with DMA issues. The changes were discussed a lot, and tested a lot, on the list, and acked by the relevant mm maintainers and have been in linux-next all this past week with no reported problems" * tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: tps6598x: Fix broken polling mode after system suspend/resume mm: page_table_check: Ensure user pages are not slab pages mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM usb: usbfs: Use consistent mmap functions usb: usbfs: Enforce page requirements for mmap dt-bindings: usb: snps,dwc3: Fix "snps,hsphy_interface" type usb: gadget: udc: fix NULL dereference in remove() usb: gadget: f_fs: Add unbind event before functionfs_unbind usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM
2023-06-03Merge tag 'scsi-fixes' of ↵Linus Torvalds1-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Five fixes, all in drivers. The most extensive is the target change to fix the hang in the login code, which involves changing timers from per login to per connection" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: stex: Fix gcc 13 warnings scsi: qla2xxx: Fix NULL pointer dereference in target mode scsi: target: iscsi: Prevent login threads from racing between each other scsi: target: iscsi: Remove unused transport_timer scsi: target: iscsi: Fix hang in the iSCSI login code
2023-06-03net/ipv6: convert skip_notify_on_dev_down sysctl to u8Eric Dumazet1-1/+1
Save a bit a space, and could help future sysctls to use the same pattern. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-03net/ipv6: fix bool/int mismatch for skip_notify_on_dev_downEric Dumazet1-1/+1
skip_notify_on_dev_down ctl table expects this field to be an int (4 bytes), not a bool (1 byte). Because proc_dou8vec_minmax() was added in 5.13, this patch converts skip_notify_on_dev_down to an int. Following patch then converts the field to u8 and use proc_dou8vec_minmax(). Fixes: 7c6bb7d2faaf ("net/ipv6: Add knob to skip DELROUTE message on device down") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-02media: v4l2-subdev: Fix missing kerneldoc for client_capsTomi Valkeinen1-0/+1
Add missing kernel doc for the new 'client_caps' field in struct v4l2_subdev_fh. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Fixes: f57fa2959244 ("media: v4l2-subdev: Add new ioctl for client capabilities") Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-02Merge tag 'nfsd-6.4-2' of ↵Linus Torvalds1-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Two minor bug fixes * tag 'nfsd-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: fix double fget() bug in __write_ports_addfd() nfsd: make a copy of struct iattr before calling notify_change
2023-06-02net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294Akihiro Suda1-5/+1
With this commit, all the GIDs ("0 4294967294") can be written to the "net.ipv4.ping_group_range" sysctl. Note that 4294967295 (0xffffffff) is an invalid GID (see gid_valid() in include/linux/uidgid.h), and an attempt to register this number will cause -EINVAL. Prior to this commit, only up to GID 2147483647 could be covered. Documentation/networking/ip-sysctl.rst had "0 4294967295" as an example value, but this example was wrong and causing -EINVAL. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Co-developed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-02neighbour: fix unaligned access to pneigh_entryQingfang DENG1-1/+1
After the blamed commit, the member key is longer 4-byte aligned. On platforms that do not support unaligned access, e.g., MIPS32R2 with unaligned_action set to 1, this will trigger a crash when accessing an IPv6 pneigh_entry, as the key is cast to an in6_addr pointer. Change the type of the key to u32 to make it aligned. Fixes: 62dd93181aaa ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.") Signed-off-by: Qingfang DENG <qingfang.deng@siflower.com.cn> Link: https://lore.kernel.org/r/20230601015432.159066-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-02Merge tag 'efi-fixes-for-v6.4-1' of ↵Linus Torvalds3-12/+21
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "A few minor fixes for EFI, one of which fixes the reported boot regression when booting x86 kernels using the BIOS based loader built into the hypervisor framework on macOS. - fix harmless warning in zboot code on 'make clean' - add some missing prototypes - fix boot regressions triggered by PE/COFF header image minor version bump" * tag 'efi-fixes-for-v6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Bump stub image version for macOS HVF compatibility efi: fix missing prototype warnings efi/libstub: zboot: Avoid eager evaluation of objcopy flags
2023-06-02Merge tag 'net-6.4-rc5' of ↵Linus Torvalds4-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Happy Wear a Dress Day. Fairly standard-sized batch of fixes, accounting for the lack of sub-tree submissions this week. The mlx5 IRQ fixes are notable, people were complaining about that. No fires burning. Current release - regressions: - eth: mlx5e: - multiple fixes for dynamic IRQ allocation - prevent encap offload when neigh update is running - eth: mana: fix perf regression: remove rx_cqes, tx_cqes counters Current release - new code bugs: - eth: mlx5e: DR, add missing mutex init/destroy in pattern manager Previous releases - always broken: - tcp: deny tcp_disconnect() when threads are waiting - sched: prevent ingress Qdiscs from getting installed in random locations in the hierarchy and moving around - sched: flower: fix possible OOB write in fl_set_geneve_opt() - netlink: fix NETLINK_LIST_MEMBERSHIPS length report - udp6: fix race condition in udp6_sendmsg & connect - tcp: fix mishandling when the sack compression is deferred - rtnetlink: validate link attributes set at creation time - mptcp: fix connect timeout handling - eth: stmmac: fix call trace when stmmac_xdp_xmit() is invoked - eth: amd-xgbe: fix the false linkup in xgbe_phy_status - eth: mlx5e: - fix corner cases in internal buffer configuration - drain health before unregistering devlink - usb: qmi_wwan: set DTR quirk for BroadMobi BM818 Misc: - tcp: return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set" * tag 'net-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits) mptcp: fix active subflow finalization mptcp: add annotations around sk->sk_shutdown accesses mptcp: fix data race around msk->first access mptcp: consolidate passive msk socket initialization mptcp: add annotations around msk->subflow accesses mptcp: fix connect timeout handling rtnetlink: add the missing IFLA_GRO_ tb check in validate_linkmsg rtnetlink: move IFLA_GSO_ tb check to validate_linkmsg rtnetlink: call validate_linkmsg in rtnl_create_link ice: recycle/free all of the fragments from multi-buffer frame net: phy: mxl-gpy: extend interrupt fix to all impacted variants net: renesas: rswitch: Fix return value in error path of xmit net: dsa: mv88e6xxx: Increase wait after reset deactivation net: ipa: Use correct value for IPA_STATUS_SIZE tcp: fix mishandling when the sack compression is deferred. net/sched: flower: fix possible OOB write in fl_set_geneve_opt() sfc: fix error unwinds in TC offload net/mlx5: Read embedded cpu after init bit cleared net/mlx5e: Fix error handling in mlx5e_refresh_tirs net/mlx5: Ensure af_desc.mask is properly initialized ...
2023-06-02fork, vhost: Use CLONE_THREAD to fix freezer/ps regressionMike Christie2-13/+3
When switching from kthreads to vhost_tasks two bugs were added: 1. The vhost worker tasks's now show up as processes so scripts doing ps or ps a would not incorrectly detect the vhost task as another process. 2. kthreads disabled freeze by setting PF_NOFREEZE, but vhost tasks's didn't disable or add support for them. To fix both bugs, this switches the vhost task to be thread in the process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that SIGKILL/STOP support is required because CLONE_THREAD requires CLONE_SIGHAND which requires those 2 signals to be supported. This is a modified version of the patch written by Mike Christie <michael.christie@oracle.com> which was a modified version of patch originally written by Linus. Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER. Including ignoring signals, setting up the register state, and having get_signal return instead of calling do_group_exit. Tidied up the vhost_task abstraction so that the definition of vhost_task only needs to be visible inside of vhost_task.c. Making it easier to review the code and tell what needs to be done where. As part of this the main loop has been moved from vhost_worker into vhost_task_fn. vhost_worker now returns true if work was done. The main loop has been updated to call get_signal which handles SIGSTOP, freezing, and collects the message that tells the thread to exit as part of process exit. This collection clears __fatal_signal_pending. This collection is not guaranteed to clear signal_pending() so clear that explicitly so the schedule() sleeps. For now the vhost thread continues to exist and run work until the last file descriptor is closed and the release function is called as part of freeing struct file. To avoid hangs in the coredump rendezvous and when killing threads in a multi-threaded exec. The coredump code and de_thread have been modified to ignore vhost threads. Remvoing the special case for exec appears to require teaching vhost_dev_flush how to directly complete transactions in case the vhost thread is no longer running. Removing the special case for coredump rendezvous requires either the above fix needed for exec or moving the coredump rendezvous into get_signal. Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads") Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Co-developed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-01Merge tag 'firewire-fixes-6.4-rc4' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "A single patch to use a flexible array rather than a zero-length one" * tag 'firewire-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: Replace zero-length array with flexible-array member
2023-06-01firewire: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-1/+1
Zero-length and one-element arrays are deprecated, and we are moving towards adopting C99 flexible-array members, instead. Address the following warnings found with GCC-13 and -fstrict-flex-arrays=3 enabled: sound/firewire/amdtp-stream.c: In function ‘build_it_pkt_header’: sound/firewire/amdtp-stream.c:694:17: warning: ‘generate_cip_header’ accessing 8 bytes in a region of size 0 [-Wstringop-overflow=] 694 | generate_cip_header(s, cip_header, data_block_counter, syt); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sound/firewire/amdtp-stream.c:694:17: note: referencing argument 2 of type ‘__be32[2]’ {aka ‘unsigned int[2]’} sound/firewire/amdtp-stream.c:667:13: note: in a call to function ‘generate_cip_header’ 667 | static void generate_cip_header(struct amdtp_stream *s, __be32 cip_header[2], | ^~~~~~~~~~~~~~~~~~~ This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -fstrict-flex-arrays=3 [1]. Link: https://github.com/KSPP/linux/issues/21 Link: https://github.com/KSPP/linux/issues/303 Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/ZHT0V3SpvHyxCv5W@work Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2023-06-01tcp: fix mishandling when the sack compression is deferred.fuyuanli1-0/+1
In this patch, we mainly try to handle sending a compressed ack correctly if it's deferred. Here are more details in the old logic: When sack compression is triggered in the tcp_compressed_ack_kick(), if the sock is owned by user, it will set TCP_DELACK_TIMER_DEFERRED and then defer to the release cb phrase. Later once user releases the sock, tcp_delack_timer_handler() should send a ack as expected, which, however, cannot happen due to lack of ICSK_ACK_TIMER flag. Therefore, the receiver would not sent an ack until the sender's retransmission timeout. It definitely increases unnecessary latency. Fixes: 5d9f4262b7ea ("tcp: add SACK compression") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: fuyuanli <fuyuanli@didiglobal.com> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/netdev/20230529113804.GA20300@didi-ThinkCentre-M920t-N000/ Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230531080150.GA20424@didi-ThinkCentre-M920t-N000 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-31nfsd: fix double fget() bug in __write_ports_addfd()Dan Carpenter1-4/+3
The bug here is that you cannot rely on getting the same socket from multiple calls to fget() because userspace can influence that. This is a kind of double fetch bug. The fix is to delete the svc_alien_sock() function and instead do the checking inside the svc_addsock() function. Fixes: 3064639423c4 ("nfsd: check passed socket's net matches NFSd superblock's one") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-05-30net: mana: Fix perf regression: remove rx_cqes, tx_cqes countersHaiyang Zhang1-2/+0
The apc->eth_stats.rx_cqes is one per NIC (vport), and it's on the frequent and parallel code path of all queues. So, r/w into this single shared variable by many threads on different CPUs creates a lot caching and memory overhead, hence perf regression. And, it's not accurate due to the high volume concurrent r/w. For example, a workload is iperf with 128 threads, and with RPS enabled. We saw perf regression of 25% with the previous patch adding the counters. And this patch eliminates the regression. Since the error path of mana_poll_rx_cq() already has warnings, so keeping the counter and convert it to a per-queue variable is not necessary. So, just remove this counter from this high frequency code path. Also, remove the tx_cqes counter for the same reason. We have warnings & other counters for errors on that path, and don't need to count every normal cqe processing. Cc: stable@vger.kernel.org Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/1685115537-31675-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30platform/surface: aggregator: Make to_ssam_device_driver() respect constnessMaximilian Luz1-5/+1
Make to_ssam_device_driver() a bit safer by replacing container_of() with container_of_const() to respect the constness of the passed in pointer, instead of silently discarding any const specifications. This change also makes it more similar to to_ssam_device(), which already uses container_of_const(). Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20230525205041.2774947-1-luzmaximilian@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-30tcp: deny tcp_disconnect() when threads are waitingEric Dumazet1-0/+4
Historically connect(AF_UNSPEC) has been abused by syzkaller and other fuzzers to trigger various bugs. A recent one triggers a divide-by-zero [1], and Paolo Abeni was able to diagnose the issue. tcp_recvmsg_locked() has tests about sk_state being not TCP_LISTEN and TCP REPAIR mode being not used. Then later if socket lock is released in sk_wait_data(), another thread can call connect(AF_UNSPEC), then make this socket a TCP listener. When recvmsg() is resumed, it can eventually call tcp_cleanup_rbuf() and attempt a divide by 0 in tcp_rcv_space_adjust() [1] This patch adds a new socket field, counting number of threads blocked in sk_wait_event() and inet_wait_for_connect(). If this counter is not zero, tcp_disconnect() returns an error. This patch adds code in blocking socket system calls, thus should not hurt performance of non blocking ones. Note that we probably could revert commit 499350a5a6e7 ("tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0") to restore original tcpi_rcv_mss meaning (was 0 if no payload was ever received on a socket) [1] divide error: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 13832 Comm: syz-executor.5 Not tainted 6.3.0-rc4-syzkaller-00224-g00c7b5f4ddc5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023 RIP: 0010:tcp_rcv_space_adjust+0x36e/0x9d0 net/ipv4/tcp_input.c:740 Code: 00 00 00 00 fc ff df 4c 89 64 24 48 8b 44 24 04 44 89 f9 41 81 c7 80 03 00 00 c1 e1 04 44 29 f0 48 63 c9 48 01 e9 48 0f af c1 <49> f7 f6 48 8d 04 41 48 89 44 24 40 48 8b 44 24 30 48 c1 e8 03 48 RSP: 0018:ffffc900033af660 EFLAGS: 00010206 RAX: 4a66b76cbade2c48 RBX: ffff888076640cc0 RCX: 00000000c334e4ac RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000001 RBP: 00000000c324e86c R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880766417f8 R13: ffff888028fbb980 R14: 0000000000000000 R15: 0000000000010344 FS: 00007f5bffbfe700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32f25000 CR3: 000000007ced0000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_recvmsg_locked+0x100e/0x22e0 net/ipv4/tcp.c:2616 tcp_recvmsg+0x117/0x620 net/ipv4/tcp.c:2681 inet6_recvmsg+0x114/0x640 net/ipv6/af_inet6.c:670 sock_recvmsg_nosec net/socket.c:1017 [inline] sock_recvmsg+0xe2/0x160 net/socket.c:1038 ____sys_recvmsg+0x210/0x5a0 net/socket.c:2720 ___sys_recvmsg+0xf2/0x180 net/socket.c:2762 do_recvmmsg+0x25e/0x6e0 net/socket.c:2856 __sys_recvmmsg net/socket.c:2935 [inline] __do_sys_recvmmsg net/socket.c:2958 [inline] __se_sys_recvmmsg net/socket.c:2951 [inline] __x64_sys_recvmmsg+0x20f/0x260 net/socket.c:2951 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f5c0108c0f9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f5bffbfe168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00007f5c011ac050 RCX: 00007f5c0108c0f9 RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000003 RBP: 00007f5c010e7b39 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f5c012cfb1f R14: 00007f5bffbfe300 R15: 0000000000022000 </TASK> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Paolo Abeni <pabeni@redhat.com> Diagnosed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20230526163458.2880232-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>