summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2023-02-24net: fix __dev_kfree_skb_any() vs drop monitorEric Dumazet1-1/+3
dev_kfree_skb() is aliased to consume_skb(). When a driver is dropping a packet by calling dev_kfree_skb_any() we should propagate the drop reason instead of pretending the packet was consumed. Note: Now we have enum skb_drop_reason we could remove enum skb_free_reason (for linux-6.4) v2: added an unlikely(), suggested by Yunsheng Lin. Fixes: e6247027e517 ("net: introduce dev_consume_skb_any()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-24Merge tag 'mm-nonmm-stable-2023-02-20-15-29' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "There is no particular theme here - mainly quick hits all over the tree. Most notable is a set of zlib changes from Mikhail Zaslonko which enhances and fixes zlib's use of S390 hardware support: 'lib/zlib: Set of s390 DFLTCC related patches for kernel zlib'" * tag 'mm-nonmm-stable-2023-02-20-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (55 commits) Update CREDITS file entry for Jesper Juhl sparc: allow PM configs for sparc32 COMPILE_TEST hung_task: print message when hung_task_warnings gets down to zero. arch/Kconfig: fix indentation scripts/tags.sh: fix the Kconfig tags generation when using latest ctags nilfs2: prevent WARNING in nilfs_dat_commit_end() lib/zlib: remove redundation assignement of avail_in dfltcc_gdht() lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default lib/zlib: DFLTCC always switch to software inflate for Z_PACKET_FLUSH option lib/zlib: DFLTCC support inflate with small window lib/zlib: Split deflate and inflate states for DFLTCC lib/zlib: DFLTCC not writing header bits when avail_out == 0 lib/zlib: fix DFLTCC ignoring flush modes when avail_in == 0 lib/zlib: fix DFLTCC not flushing EOBS when creating raw streams lib/zlib: implement switching between DFLTCC and software lib/zlib: adjust offset calculation for dfltcc_state nilfs2: replace WARN_ONs for invalid DAT metadata block requests scripts/spelling.txt: add "exsits" pattern and fix typo instances fs: gracefully handle ->get_block not mapping bh in __mpage_writepage cramfs: Kconfig: fix spelling & punctuation ...
2023-02-24Merge tag 'mm-stable-2023-02-20-13-37' of ↵Linus Torvalds1-5/+6
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...
2023-02-24net/9p: Adjust maximum MSIZE to account for p9 headerEric Van Hensbergen1-1/+5
Add maximum p9 header size to MSIZE to make sure we can have page aligned data. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
2023-02-23sctp: add a refcnt in sctp_stream_priorities to avoid a nested loopXin Long1-31/+21
With this refcnt added in sctp_stream_priorities, we don't need to traverse all streams to check if the prio is used by other streams when freeing one stream's prio in sctp_sched_prio_free_sid(). This can avoid a nested loop (up to 65535 * 65535), which may cause a stuck as Ying reported: watchdog: BUG: soft lockup - CPU#23 stuck for 26s! [ksoftirqd/23:136] Call Trace: <TASK> sctp_sched_prio_free_sid+0xab/0x100 [sctp] sctp_stream_free_ext+0x64/0xa0 [sctp] sctp_stream_free+0x31/0x50 [sctp] sctp_association_free+0xa5/0x200 [sctp] Note that it doesn't need to use refcount_t type for this counter, as its accessing is always protected under the sock lock. v1->v2: - add a check in sctp_sched_prio_set to avoid the possible prio_head refcnt overflow. Fixes: 9ed7bfc79542 ("sctp: fix memory leak in sctp_stream_outq_migrate()") Reported-by: Ying Xu <yinxu@redhat.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/825eb0c905cb864991eba335f4a2b780e543f06b.1677085641.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-23ipv6: Add lwtunnel encap size of all siblings in nexthop calculationLu Wei1-5/+6
In function rt6_nlmsg_size(), the length of nexthop is calculated by multipling the nexthop length of fib6_info and the number of siblings. However if the fib6_info has no lwtunnel but the siblings have lwtunnels, the nexthop length is less than it should be, and it will trigger a warning in inet6_rt_notify() as follows: WARNING: CPU: 0 PID: 6082 at net/ipv6/route.c:6180 inet6_rt_notify+0x120/0x130 ...... Call Trace: <TASK> fib6_add_rt2node+0x685/0xa30 fib6_add+0x96/0x1b0 ip6_route_add+0x50/0xd0 inet6_rtm_newroute+0x97/0xa0 rtnetlink_rcv_msg+0x156/0x3d0 netlink_rcv_skb+0x5a/0x110 netlink_unicast+0x246/0x350 netlink_sendmsg+0x250/0x4c0 sock_sendmsg+0x66/0x70 ___sys_sendmsg+0x7c/0xd0 __sys_sendmsg+0x5d/0xb0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc This bug can be reproduced by script: ip -6 addr add 2002::2/64 dev ens2 ip -6 route add 100::/64 via 2002::1 dev ens2 metric 100 for i in 10 20 30 40 50 60 70; do ip link add link ens2 name ipv_$i type ipvlan ip -6 addr add 2002::$i/64 dev ipv_$i ifconfig ipv_$i up done for i in 10 20 30 40 50 60; do ip -6 route append 100::/64 encap ip6 dst 2002::$i via 2002::1 dev ipv_$i metric 100 done ip -6 route append 100::/64 via 2002::1 dev ipv_70 metric 100 This patch fixes it by adding nexthop_len of every siblings using rt6_nh_nlmsg_size(). Fixes: beb1afac518d ("net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attribute") Signed-off-by: Lu Wei <luwei32@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230222083629.335683-2-luwei32@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski13-29/+48
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix broken listing of set elements when table has an owner. 2) Fix conntrack refcount leak in ctnetlink with related conntrack entries, from Hangyu Hua. 3) Fix use-after-free/double-free in ctnetlink conntrack insert path, from Florian Westphal. 4) Fix ip6t_rpfilter with VRF, from Phil Sutter. 5) Fix use-after-free in ebtables reported by syzbot, also from Florian. 6) Use skb->len in xt_length to deal with IPv6 jumbo packets, from Xin Long. 7) Fix NETLINK_LISTEN_ALL_NSID with ctnetlink, from Florian Westphal. 8) Fix memleak in {ip_,ip6_,arp_}tables in ENOMEM error case, from Pavel Tikhomirov. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: x_tables: fix percpu counter block leak on error path when creating new netns netfilter: ctnetlink: make event listener tracking global netfilter: xt_length: use skb len to match in length_mt6 netfilter: ebtables: fix table blob use-after-free netfilter: ip6t_rpfilter: Fix regression with VRF interfaces netfilter: conntrack: fix rmmod double-free race netfilter: ctnetlink: fix possible refcount leak in ctnetlink_create_conntrack() netfilter: nf_tables: allow to fetch set elements when table has an owner ==================== Link: https://lore.kernel.org/r/20230222092137.88637-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-23Merge tag 'nfs-for-6.3-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2-4/+6
Pull NFS client updates from Anna Schumaker: "New Features: - Convert the read and write paths to use folios Bugfixes and Cleanups: - Fix tracepoint state manager flag printing - Fix disabling swap files - Fix NFSv4 client identifier sysfs path in the documentation - Don't clear NFS_CAP_COPY if server returns NFS4ERR_OFFLOAD_DENIED - Treat GETDEVICEINFO errors as a layout failure - Replace kmap_atomic() calls with kmap_local_page() - Constify sunrpc sysfs kobj_type structures" * tag 'nfs-for-6.3-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (25 commits) fs/nfs: Replace kmap_atomic() with kmap_local_page() in dir.c pNFS/filelayout: treat GETDEVICEINFO errors as layout failure Documentation: Fix sysfs path for the NFSv4 client identifier nfs42: do not fail with EIO if ssc returns NFS4ERR_OFFLOAD_DENIED NFS: fix disabling of swap SUNRPC: make kobj_type structures constant nfs4trace: fix state manager flag printing NFS: Remove unnecessary check in nfs_read_folio() NFS: Improve tracing of nfs_wb_folio() NFS: Enable tracing of nfs_invalidate_folio() and nfs_launder_folio() NFS: fix up nfs_release_folio() to try to release the page NFS: Clean up O_DIRECT request allocation NFS: Fix up nfs_vm_page_mkwrite() for folios NFS: Convert nfs_write_begin/end to use folios NFS: Remove unused function nfs_wb_page() NFS: Convert buffered writes to use folios NFS: Convert the function nfs_wb_page() to use folios NFS: Convert buffered reads to use folios NFS: Add a helper nfs_wb_folio() NFS: Convert the remaining pagelist helper functions to support folios ...
2023-02-23Merge tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds24-1329/+4758
Pull nfsd updates from Chuck Lever: "Two significant security enhancements are part of this release: - NFSD's RPC header encoding and decoding, including RPCSEC GSS and gssproxy header parsing, has been overhauled to make it more memory-safe. - Support for Kerberos AES-SHA2-based encryption types has been added for both the NFS client and server. This provides a clean path for deprecating and removing insecure encryption types based on DES and SHA-1. AES-SHA2 is also FIPS-140 compliant, so that NFS with Kerberos may now be used on systems with fips enabled. In addition to these, NFSD is now able to handle crossing into an auto-mounted mount point on an exported NFS mount. A number of fixes have been made to NFSD's server-side copy implementation. RPC metrics have been converted to per-CPU variables. This helps reduce unnecessary cross-CPU and cross-node memory bus traffic, and significantly reduces noise when KCSAN is enabled" * tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (121 commits) NFSD: Clean up nfsd_symlink() NFSD: copy the whole verifier in nfsd_copy_write_verifier nfsd: don't fsync nfsd_files on last close SUNRPC: Fix occasional warning when destroying gss_krb5_enctypes nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open NFSD: fix problems with cleanup on errors in nfsd4_copy nfsd: fix race to check ls_layouts nfsd: don't hand out delegation on setuid files being opened for write SUNRPC: Remove ->xpo_secure_port() SUNRPC: Clean up the svc_xprt_flags() macro nfsd: remove fs/nfsd/fault_inject.c NFSD: fix leaked reference count of nfsd4_ssc_umount_item nfsd: clean up potential nfsd_file refcount leaks in COPY codepath nfsd: zero out pointers after putting nfsd_files on COPY setup error SUNRPC: Fix whitespace damage in svcauth_unix.c nfsd: eliminate __nfs4_get_fd nfsd: add some kerneldoc comments for stateid preprocessing functions nfsd: eliminate find_deleg_file_locked nfsd: don't take nfsd4_copy ref for OP_OFFLOAD_STATUS SUNRPC: Add encryption self-tests ...
2023-02-22Merge tag 'for-linus-2023022201' of ↵Linus Torvalds2-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Benjamin Tissoires: - HID-BPF infrastructure: this allows to start using HID-BPF. Note that the mechanism to ship HID-BPF program through the kernel tree is still not implemented yet (but is planned). This should be a no-op for 99% of users. Also we are gaining kselftests for the HID tree (Benjamin Tissoires) - Some UAF fixes in workers when using uhid (Pietro Borrello & Benjamin Tissoires) - Constify hid_ll_driver (Thomas Weißschuh) - Allow more custom IIO sensors through HID (Philipp Jungkamp) - Logitech HID++ fixes for scroll wheel, protocol and debug (Bastien Nocera) - Some new device support: Steam Deck (Vicki Pfau), UClogic (José Expósito), Logitech G923 Xbox Edition steering wheel (Walt Holman), EVision keyboards (Philippe Valembois) - other assorted code cleanups and fixes * tag 'for-linus-2023022201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (99 commits) HID: mcp-2221: prevent UAF in delayed work hid: bigben_probe(): validate report count HID: asus: use spinlock to safely schedule workers HID: asus: use spinlock to protect concurrent accesses HID: bigben: use spinlock to safely schedule workers HID: bigben_worker() remove unneeded check on report_field HID: bigben: use spinlock to protect concurrent accesses HID: logitech-hidpp: Add myself to authors HID: logitech-hidpp: Retry commands when device is busy HID: logitech-hidpp: Add more debug statements HID: Add support for Logitech G923 Xbox Edition steering wheel HID: logitech-hidpp: Add Signature M650 HID: logitech-hidpp: Remove HIDPP_QUIRK_NO_HIDINPUT quirk HID: logitech-hidpp: Don't restart communication if not necessary HID: logitech-hidpp: Add constants for HID++ 2.0 error codes Revert "HID: logitech-hidpp: add a module parameter to keep firmware gestures" HID: logitech-hidpp: Hard-code HID++ 1.0 fast scroll support HID: i2c-hid: goodix: Add mainboard-vddio-supply dt-bindings: HID: i2c-hid: goodix: Add mainboard-vddio-supply HID: i2c-hid: goodix: Stop tying the reset line to the regulator ...
2023-02-22Merge branch 'for-6.3/hid-bpf' into for-linusBenjamin Tissoires1-1/+1
Initial support of HID-BPF (Benjamin Tissoires) The history is a little long for this series, as it was intended to be sent for v6.2. However some last minute issues forced us to postpone it to v6.3. Conflicts: * drivers/hid/i2c-hid/Kconfig: commit bf7660dab30d ("HID: stop drivers from selecting CONFIG_HID") conflicts with commit 2afac81dd165 ("HID: fix I2C_HID not selected when I2C_HID_OF_ELAN is") the resolution is simple enough: just drop the "default" and "select" lines as the new commit from Arnd is doing
2023-02-22Merge branch 'for-6.3/hid-core' into for-linusBenjamin Tissoires1-2/+1
- constify hid_ll_driver (Thomas Weißschuh) - map standard Battery System Charging to upower (José Expósito) - couple of assorted fixes and new handling of HID usages (Jingyuan Liang & Ronald Tschalär)
2023-02-22netfilter: x_tables: fix percpu counter block leak on error path when ↵Pavel Tikhomirov3-0/+12
creating new netns Here is the stack where we allocate percpu counter block: +-< __alloc_percpu +-< xt_percpu_counter_alloc +-< find_check_entry # {arp,ip,ip6}_tables.c +-< translate_table And it can be leaked on this code path: +-> ip6t_register_table +-> translate_table # allocates percpu counter block +-> xt_register_table # fails there is no freeing of the counter block on xt_register_table fail. Note: xt_percpu_counter_free should be called to free it like we do in do_replace through cleanup_entry helper (or in __ip6t_unregister_table). Probability of hitting this error path is low AFAICS (xt_register_table can only return ENOMEM here, as it is not replacing anything, as we are creating new netns, and it is hard to imagine that all previous allocations succeeded and after that one in xt_register_table failed). But it's worth fixing even the rare leak. Fixes: 71ae0dff02d7 ("netfilter: xtables: use percpu rule counters") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-22Merge tag 'net-next-6.3' of ↵Linus Torvalds299-13882/+13406
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Add dedicated kmem_cache for typical/small skb->head, avoid having to access struct page at kfree time, and improve memory use. - Introduce sysctl to set default RPS configuration for new netdevs. - Define Netlink protocol specification format which can be used to describe messages used by each family and auto-generate parsers. Add tools for generating kernel data structures and uAPI headers. - Expose all net/core sysctls inside netns. - Remove 4s sleep in netpoll if carrier is instantly detected on boot. - Add configurable limit of MDB entries per port, and port-vlan. - Continue populating drop reasons throughout the stack. - Retire a handful of legacy Qdiscs and classifiers. Protocols: - Support IPv4 big TCP (TSO frames larger than 64kB). - Add IP_LOCAL_PORT_RANGE socket option, to control local port range on socket by socket basis. - Track and report in procfs number of MPTCP sockets used. - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path manager. - IPv6: don't check net.ipv6.route.max_size and rely on garbage collection to free memory (similarly to IPv4). - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986). - ICMP: add per-rate limit counters. - Add support for user scanning requests in ieee802154. - Remove static WEP support. - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting. - WiFi 7 EHT channel puncturing support (client & AP). BPF: - Add a rbtree data structure following the "next-gen data structure" precedent set by recently added linked list, that is, by using kfunc + kptr instead of adding a new BPF map type. - Expose XDP hints via kfuncs with initial support for RX hash and timestamp metadata. - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better support decap on GRE tunnel devices not operating in collect metadata. - Improve x86 JIT's codegen for PROBE_MEM runtime error checks. - Remove the need for trace_printk_lock for bpf_trace_printk and bpf_trace_vprintk helpers. - Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case. - Significantly reduce the search time for module symbols by livepatch and BPF. - Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals. - Add support for BPF trampoline on s390x and riscv64. - Add capability to export the XDP features supported by the NIC. - Add __bpf_kfunc tag for marking kernel functions as kfuncs. - Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accounting for container environments. Netfilter: - Remove the CLUSTERIP target. It has been marked as obsolete for years, and we still have WARN splats wrt races of the out-of-band /proc interface installed by this target. - Add 'destroy' commands to nf_tables. They are identical to the existing 'delete' commands, but do not return an error if the referenced object (set, chain, rule...) did not exist. Driver API: - Improve cpumask_local_spread() locality to help NICs set the right IRQ affinity on AMD platforms. - Separate C22 and C45 MDIO bus transactions more clearly. - Introduce new DCB table to control DSCP rewrite on egress. - Support configuration of Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of shared medium Ethernet. - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing preemption of low priority frames by high priority frames. - Add support for controlling MACSec offload using netlink SET. - Rework devlink instance refcounts to allow registration and de-registration under the instance lock. Split the code into multiple files, drop some of the unnecessarily granular locks and factor out common parts of netlink operation handling. - Add TX frame aggregation parameters (for USB drivers). - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning messages with notifications for debug. - Allow offloading of UDP NEW connections via act_ct. - Add support for per action HW stats in TC. - Support hardware miss to TC action (continue processing in SW from a specific point in the action chain). - Warn if old Wireless Extension user space interface is used with modern cfg80211/mac80211 drivers. Do not support Wireless Extensions for Wi-Fi 7 devices at all. Everyone should switch to using nl80211 interface instead. - Improve the CAN bit timing configuration. Use extack to return error messages directly to user space, update the SJW handling, including the definition of a new default value that will benefit CAN-FD controllers, by increasing their oscillator tolerance. New hardware / drivers: - Ethernet: - nVidia BlueField-3 support (control traffic driver) - Ethernet support for imx93 SoCs - Motorcomm yt8531 gigabit Ethernet PHY - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA) - Microchip LAN8841 PHY (incl. cable diagnostics and PTP) - Amlogic gxl MDIO mux - WiFi: - RealTek RTL8188EU (rtl8xxxu) - Qualcomm Wi-Fi 7 devices (ath12k) - CAN: - Renesas R-Car V4H Drivers: - Bluetooth: - Set Per Platform Antenna Gain (PPAG) for Intel controllers. - Ethernet NICs: - Intel (1G, igc): - support TSN / Qbv / packet scheduling features of i226 model - Intel (100G, ice): - use GNSS subsystem instead of TTY - multi-buffer XDP support - extend support for GPIO pins to E823 devices - nVidia/Mellanox: - update the shared buffer configuration on PFC commands - implement PTP adjphase function for HW offset control - TC support for Geneve and GRE with VF tunnel offload - more efficient crypto key management method - multi-port eswitch support - Netronome/Corigine: - add DCB IEEE support - support IPsec offloading for NFP3800 - Freescale/NXP (enetc): - support XDP_REDIRECT for XDP non-linear buffers - improve reconfig, avoid link flap and waiting for idle - support MAC Merge layer - Other NICs: - sfc/ef100: add basic devlink support for ef100 - ionic: rx_push mode operation (writing descriptors via MMIO) - bnxt: use the auxiliary bus abstraction for RDMA - r8169: disable ASPM and reset bus in case of tx timeout - cpsw: support QSGMII mode for J721e CPSW9G - cpts: support pulse-per-second output - ngbe: add an mdio bus driver - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing - r8152: handle devices with FW with NCM support - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation - virtio-net: support multi buffer XDP - virtio/vsock: replace virtio_vsock_pkt with sk_buff - tsnep: XDP support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add support for latency TLV (in FW control messages) - Microchip (sparx5): - separate explicit and implicit traffic forwarding rules, make the implicit rules always active - add support for egress DSCP rewrite - IS0 VCAP support (Ingress Classification) - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.) - ES2 VCAP support (Egress Access Control) - support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1) - Ethernet embedded switches: - Marvell (mv88e6xxx): - add MAB (port auth) offload support - enable PTP receive for mv88e6390 - NXP (ocelot): - support MAC Merge layer - support for the the vsc7512 internal copper phys - Microchip: - lan9303: convert to PHYLINK - lan966x: support TC flower filter statistics - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x - lan937x: support Credit Based Shaper configuration - ksz9477: support Energy Efficient Ethernet - other: - qca8k: convert to regmap read/write API, use bulk operations - rswitch: Improve TX timestamp accuracy - Intel WiFi (iwlwifi): - EHT (Wi-Fi 7) rate reporting - STEP equalizer support: transfer some STEP (connection to radio on platforms with integrated wifi) related parameters from the BIOS to the firmware. - Qualcomm 802.11ax WiFi (ath11k): - IPQ5018 support - Fine Timing Measurement (FTM) responder role support - channel 177 support - MediaTek WiFi (mt76): - per-PHY LED support - mt7996: EHT (Wi-Fi 7) support - Wireless Ethernet Dispatch (WED) reset support - switch to using page pool allocator - RealTek WiFi (rtw89): - support new version of Bluetooth co-existance - Mobile: - rmnet: support TX aggregation" * tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits) page_pool: add a comment explaining the fragment counter usage net: ethtool: fix __ethtool_dev_mm_supported() implementation ethtool: pse-pd: Fix double word in comments xsk: add linux/vmalloc.h to xsk.c sefltests: netdevsim: wait for devlink instance after netns removal selftest: fib_tests: Always cleanup before exit net/mlx5e: Align IPsec ASO result memory to be as required by hardware net/mlx5e: TC, Set CT miss to the specific ct action instance net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG net/mlx5: Refactor tc miss handling to a single function net/mlx5: Kconfig: Make tc offload depend on tc skb extension net/sched: flower: Support hardware miss to tc action net/sched: flower: Move filter handle initialization earlier net/sched: cls_api: Support hardware miss to tc action net/sched: Rename user cookie and act cookie sfc: fix builds without CONFIG_RTC_LIB sfc: clean up some inconsistent indentings net/mlx4_en: Introduce flexible array to silence overflow warning net: lan966x: Fix possible deadlock inside PTP net/ulp: Remove redundant ->clone() test in inet_clone_ulp(). ...
2023-02-22Merge tag 'v6.3-p1' of ↵Linus Torvalds8-78/+71
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Use kmap_local instead of kmap_atomic - Change request callback to take void pointer - Print FIPS status in /proc/crypto (when enabled) Algorithms: - Add rfc4106/gcm support on arm64 - Add ARIA AVX2/512 support on x86 Drivers: - Add TRNG driver for StarFive SoC - Delete ux500/hash driver (subsumed by stm32/hash) - Add zlib support in qat - Add RSA support in aspeed" * tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (156 commits) crypto: x86/aria-avx - Do not use avx2 instructions crypto: aspeed - Fix modular aspeed-acry crypto: hisilicon/qm - fix coding style issues crypto: hisilicon/qm - update comments to match function crypto: hisilicon/qm - change function names crypto: hisilicon/qm - use min() instead of min_t() crypto: hisilicon/qm - remove some unused defines crypto: proc - Print fips status crypto: crypto4xx - Call dma_unmap_page when done crypto: octeontx2 - Fix objects shared between several modules crypto: nx - Fix sparse warnings crypto: ecc - Silence sparse warning tls: Pass rec instead of aead_req into tls_encrypt_done crypto: api - Remove completion function scaffolding tls: Remove completion function scaffolding tipc: Remove completion function scaffolding net: ipv6: Remove completion function scaffolding net: ipv4: Remove completion function scaffolding net: macsec: Remove completion function scaffolding dm: Remove completion function scaffolding ...
2023-02-22Merge tag 'hyperv-next-signed-20230220' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - allow Linux to run as the nested root partition for Microsoft Hypervisor (Jinank Jain and Nuno Das Neves) - clean up the return type of callback functions (Dawei Li) * tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Fix hv_get/set_register for nested bringup Drivers: hv: Make remove callback of hyperv driver void returned Drivers: hv: Enable vmbus driver for nested root partition x86/hyperv: Add an interface to do nested hypercalls Drivers: hv: Setup synic registers in case of nested root partition x86/hyperv: Add support for detecting nested hypervisor
2023-02-22netfilter: ctnetlink: make event listener tracking globalFlorian Westphal3-5/+9
pernet tracking doesn't work correctly because other netns might have set NETLINK_LISTEN_ALL_NSID on its event socket. In this case its expected that events originating in other net namespaces are also received. Making pernet-tracking work while also honoring NETLINK_LISTEN_ALL_NSID requires much more intrusive changes both in netlink and nfnetlink, f.e. adding a 'setsockopt' callback that lets nfnetlink know that the event socket entered (or left) ALL_NSID mode. Move to global tracking instead: if there is an event socket anywhere on the system, all net namespaces which have conntrack enabled and use autobind mode will allocate the ecache extension. netlink_has_listeners() returns false only if the given group has no subscribers in any net namespace, the 'net' argument passed to nfnetlink_has_listeners is only used to derive the protocol (nfnetlink), it has no other effect. For proper NETLINK_LISTEN_ALL_NSID-aware pernet tracking of event listeners a new netlink_has_net_listeners() is also needed. Fixes: 90d1daa45849 ("netfilter: conntrack: add nf_conntrack_events autodetect mode") Reported-by: Bryce Kahle <bryce.kahle@datadoghq.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-22netfilter: xt_length: use skb len to match in length_mt6Xin Long1-2/+1
For IPv6 Jumbo packets, the ipv6_hdr(skb)->payload_len is always 0, and its real payload_len ( > 65535) is saved in hbh exthdr. With 0 length for the jumbo packets, it may mismatch. To fix this, we can just use skb->len instead of parsing exthdrs, as the hbh exthdr parsing has been done before coming to length_mt6 in ip6_rcv_core() and br_validate_ipv6() and also the packet has been trimmed according to the correct IPv6 (ext)hdr length there, and skb len is trustable in length_mt6(). Note that this patch is especially needed after the IPv6 BIG TCP was supported in kernel, which is using IPv6 Jumbo packets. Besides, to match the packets greater than 65535 more properly, a v1 revision of xt_length may be needed to extend "min, max" to u32 in the future, and for now the IPv6 Jumbo packets can be matched by: # ip6tables -m length ! --length 0:65535 Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536") Fixes: 0fe79f28bfaf ("net: allow gro_max_size to exceed 65536") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-22netfilter: ebtables: fix table blob use-after-freeFlorian Westphal3-5/+3
We are not allowed to return an error at this point. Looking at the code it looks like ret is always 0 at this point, but its not. t = find_table_lock(net, repl->name, &ret, &ebt_mutex); ... this can return a valid table, with ret != 0. This bug causes update of table->private with the new blob, but then frees the blob right away in the caller. Syzbot report: BUG: KASAN: vmalloc-out-of-bounds in __ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168 Read of size 4 at addr ffffc90005425000 by task kworker/u4:4/74 Workqueue: netns cleanup_net Call Trace: kasan_report+0xbf/0x1f0 mm/kasan/report.c:517 __ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168 ebt_unregister_table+0x35/0x40 net/bridge/netfilter/ebtables.c:1372 ops_exit_list+0xb0/0x170 net/core/net_namespace.c:169 cleanup_net+0x4ee/0xb10 net/core/net_namespace.c:613 ... ip(6)tables appears to be ok (ret should be 0 at this point) but make this more obvious. Fixes: c58dd2dd443c ("netfilter: Can't fail and free after table replacement") Reported-by: syzbot+f61594de72d6705aea03@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-22netfilter: ip6t_rpfilter: Fix regression with VRF interfacesPhil Sutter1-1/+3
When calling ip6_route_lookup() for the packet arriving on the VRF interface, the result is always the real (slave) interface. Expect this when validating the result. Fixes: acc641ab95b66 ("netfilter: rpfilter/fib: Populate flowic_l3mdev field") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-22netfilter: conntrack: fix rmmod double-free raceFlorian Westphal3-14/+15
nf_conntrack_hash_check_insert() callers free the ct entry directly, via nf_conntrack_free. This isn't safe anymore because nf_conntrack_hash_check_insert() might place the entry into the conntrack table and then delteted the entry again because it found that a conntrack extension has been removed at the same time. In this case, the just-added entry is removed again and an error is returned to the caller. Problem is that another cpu might have picked up this entry and incremented its reference count. This results in a use-after-free/double-free, once by the other cpu and once by the caller of nf_conntrack_hash_check_insert(). Fix this by making nf_conntrack_hash_check_insert() not fail anymore after the insertion, just like before the 'Fixes' commit. This is safe because a racing nf_ct_iterate() has to wait for us to release the conntrack hash spinlocks. While at it, make the function return -EAGAIN in the rmmod (genid changed) case, this makes nfnetlink replay the command (suggested by Pablo Neira). Fixes: c56716c69ce1 ("netfilter: extensions: introduce extension genid count") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-22netfilter: ctnetlink: fix possible refcount leak in ctnetlink_create_conntrack()Hangyu Hua1-1/+4
nf_ct_put() needs to be called to put the refcount got by nf_conntrack_find_get() to avoid refcount leak when nf_conntrack_hash_check_insert() fails. Fixes: 7d367e06688d ("netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-21Merge tag 'hardening-v6.3-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "Beyond some specific LoadPin, UBSAN, and fortify features, there are other fixes scattered around in various subsystems where maintainers were okay with me carrying them in my tree or were non-responsive but the patches were reviewed by others: - Replace 0-length and 1-element arrays with flexible arrays in various subsystems (Paulo Miguel Almeida, Stephen Rothwell, Kees Cook) - randstruct: Disable Clang 15 support (Eric Biggers) - GCC plugins: Drop -std=gnu++11 flag (Sam James) - strpbrk(): Refactor to use strchr() (Andy Shevchenko) - LoadPin LSM: Allow root filesystem switching when non-enforcing - fortify: Use dynamic object size hints when available - ext4: Fix CFI function prototype mismatch - Nouveau: Fix DP buffer size arguments - hisilicon: Wipe entire crypto DMA pool on error - coda: Fully allocate sig_inputArgs - UBSAN: Improve arm64 trap code reporting - copy_struct_from_user(): Add minimum bounds check on kernel buffer size" * tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: randstruct: disable Clang 15 support uaccess: Add minimum bounds check on kernel buffer size arm64: Support Clang UBSAN trap codes for better reporting coda: Avoid partial allocation of sig_inputArgs gcc-plugins: drop -std=gnu++11 to fix GCC 13 build lib/string: Use strchr() in strpbrk() crypto: hisilicon: Wipe entire pool on error net/i40e: Replace 0-length array with flexible array io_uring: Replace 0-length array with flexible array ext4: Fix function prototype mismatch for ext4_feat_ktype i915/gvt: Replace one-element array with flexible-array member drm/nouveau/disp: Fix nvif_outp_acquire_dp() argument size LoadPin: Allow filesystem switch when not enforcing LoadPin: Move pin reporting cleanly out of locking LoadPin: Refactor sysctl initialization LoadPin: Refactor read-only check into a helper ARM: ixp4xx: Replace 0-length arrays with flexible arrays fortify: Use __builtin_dynamic_object_size() when available rxrpc: replace zero-lenth array with DECLARE_FLEX_ARRAY() helper
2023-02-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-67/+77
Per-next-PR merge. net/smc/af_smc.c b5dd4d698171 ("net/smc: llc_conf_mutex refactor, replace it with rw_semaphore") e40b801b3603 ("net/smc: fix potential panic dues to unprotected smc_llc_srv_add_link()") https://lore.kernel.org/all/20230221124008.6303c330@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21net: ethtool: fix __ethtool_dev_mm_supported() implementationVladimir Oltean1-1/+1
The MAC Merge layer is supported when ops->get_mm() returns 0. The implementation was changed during review, and in this process, a bug was introduced. Link: https://lore.kernel.org/netdev/20230111161706.1465242-5-vladimir.oltean@nxp.com/ Fixes: 04692c9020b7 ("net: ethtool: netlink: retrieve stats from multiple sources (eMAC, pMAC)") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu> Link: https://lore.kernel.org/all/20230220122343.1156614-2-vladimir.oltean@nxp.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21ethtool: pse-pd: Fix double word in commentsBo Liu1-1/+1
Remove the repeated word "for" in comments. Signed-off-by: Bo Liu <liubo03@inspur.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20230221083036.2414-1-liubo03@inspur.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21xsk: add linux/vmalloc.h to xsk.cXuan Zhuo1-0/+1
Fix the failure of the compilation under the sh4. Because we introduced remap_vmalloc_range() earlier, this has caused the compilation failure on the sh4 platform. So this introduction of the header file of linux/vmalloc.h. config: sh-allmodconfig (https://download.01.org/0day-ci/archive/20230221/202302210041.kpPQLlNQ-lkp@intel.com/config) compiler: sh4-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=9f78bf330a66cd400b3e00f370f597e9fa939207 git remote add net-next https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git git fetch --no-tags net-next master git checkout 9f78bf330a66cd400b3e00f370f597e9fa939207 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh SHELL=/bin/bash net/ Fixes: 9f78bf330a66 ("xsk: support use vaddr as ring") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202302210041.kpPQLlNQ-lkp@intel.com/ Link: https://lore.kernel.org/r/20230221075140.46988-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21net/sched: flower: Support hardware miss to tc actionPaul Blakey1-1/+12
To support hardware miss to tc action in actions on the flower classifier, implement the required getting of filter actions, and setup filter exts (actions) miss by giving it the filter's handle and actions. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21net/sched: flower: Move filter handle initialization earlierPaul Blakey1-27/+35
To support miss to action during hardware offload the filter's handle is needed when setting up the actions (tcf_exts_init()), and before offloading. Move filter handle initialization earlier. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21net/sched: cls_api: Support hardware miss to tc actionPaul Blakey3-12/+208
For drivers to support partial offload of a filter's action list, add support for action miss to specify an action instance to continue from in sw. CT action in particular can't be fully offloaded, as new connections need to be handled in software. This imposes other limitations on the actions that can be offloaded together with the CT action, such as packet modifications. Assign each action on a filter's action list a unique miss_cookie which drivers can then use to fill action_miss part of the tc skb extension. On getting back this miss_cookie, find the action instance with relevant cookie and continue classifying from there. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21net/sched: Rename user cookie and act cookiePaul Blakey2-27/+27
struct tc_action->act_cookie is a user defined cookie, and the related struct flow_action_entry->act_cookie is used as an handle similar to struct flow_cls_offload->cookie. Rename tc_action->act_cookie to user_cookie, and flow_action_entry->act_cookie to cookie so their names would better fit their usage. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21Merge tag 'ieee802154-for-net-next-2023-02-20' of ↵Jakub Kicinski14-34/+1099
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2023-02-20 Miquel Raynal build upon his earlier work and introduced two new features into the ieee802154 stack. Beaconing to announce existing PAN's and passive scanning to discover the beacons and associated PAN's. The matching changes to the userspace configuration tool have been posted as well and will be released together with the kernel release. Arnd Bergmann and Dmitry Torokhov worked on converting the at86rf230 and cc2520 drivers away from the unused platform_data usage and towards the new gpiod API. (I had to add a revert as Dmitry found a regression on an already pushed tree on my side). Changes since v1 (pull request 2023-02-02) - Netlink API extack and NLA_POLICY* usage as suggested by Jakub - Removed always true condition found by kernel test robot - Simplify device removal with running background job for scanning - Fix problems with beacon sending in some cases by using the MLME tx path * tag 'ieee802154-for-net-next-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next: ieee802154: Drop device trackers mac802154: Fix an always true condition mac802154: Send beacons using the MLME Tx path ieee802154: Change error code on monitor scan netlink request ieee802154: Convert scan error messages to extack ieee802154: Use netlink policies when relevant on scan parameters ieee802154: at86rf230: switch to using gpiod API ieee802154: at86rf230: drop support for platform data Revert "at86rf230: convert to gpio descriptors" cc2520: move to gpio descriptors mac802154: Avoid superfluous endianness handling at86rf230: convert to gpio descriptors mac802154: Handle basic beaconing ieee802154: Add support for user beaconing requests mac802154: Handle passive scanning mac802154: Add MLME Tx locked helpers mac802154: Prepare forcing specific symbol duration ieee802154: Introduce a helper to validate a channel ieee802154: Define a beacon frame header ieee802154: Add support for user scanning requests ==================== Link: https://lore.kernel.org/r/20230220213749.386451-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21net/ulp: Remove redundant ->clone() test in inet_clone_ulp().Kuniyuki Iwashima1-2/+1
Commit 2c02d41d71f9 ("net/ulp: prevent ULP without clone op from entering the LISTEN status") guarantees that all ULP listeners have clone() op, so we no longer need to test it in inet_clone_ulp(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230217200920.85306-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21Merge tag 'for-netdev' of ↵Jakub Kicinski3-41/+63
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-02-17 We've added 64 non-merge commits during the last 7 day(s) which contain a total of 158 files changed, 4190 insertions(+), 988 deletions(-). The main changes are: 1) Add a rbtree data structure following the "next-gen data structure" precedent set by recently-added linked-list, that is, by using kfunc + kptr instead of adding a new BPF map type, from Dave Marchevsky. 2) Add a new benchmark for hashmap lookups to BPF selftests, from Anton Protopopov. 3) Fix bpf_fib_lookup to only return valid neighbors and add an option to skip the neigh table lookup, from Martin KaFai Lau. 4) Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accouting for container environments, from Yafang Shao. 5) Batch of ice multi-buffer and driver performance fixes, from Alexander Lobakin. 6) Fix a bug in determining whether global subprog's argument is PTR_TO_CTX, which is based on type names which breaks kprobe progs, from Andrii Nakryiko. 7) Prep work for future -mcpu=v4 LLVM option which includes usage of BPF_ST insn. Thus improve BPF_ST-related value tracking in verifier, from Eduard Zingerman. 8) More prep work for later building selftests with Memory Sanitizer in order to detect usages of undefined memory, from Ilya Leoshkevich. 9) Fix xsk sockets to check IFF_UP earlier to avoid a NULL pointer dereference via sendmsg(), from Maciej Fijalkowski. 10) Implement BPF trampoline for RV64 JIT compiler, from Pu Lehui. 11) Fix BPF memory allocator in combination with BPF hashtab where it could corrupt special fields e.g. used in bpf_spin_lock, from Hou Tao. 12) Fix LoongArch BPF JIT to always use 4 instructions for function address so that instruction sequences don't change between passes, from Hengqi Chen. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits) selftests/bpf: Add bpf_fib_lookup test bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup riscv, bpf: Add bpf trampoline support for RV64 riscv, bpf: Add bpf_arch_text_poke support for RV64 riscv, bpf: Factor out emit_call for kernel and bpf context riscv: Extend patch_text for multiple instructions Revert "bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES" selftests/bpf: Add global subprog context passing tests selftests/bpf: Convert test_global_funcs test to test_loader framework bpf: Fix global subprog context argument resolution logic LoongArch, bpf: Use 4 instructions for function address in JIT bpf: bpf_fib_lookup should not return neigh in NUD_FAILED state bpf: Disable bh in bpf_test_run for xdp and tc prog xsk: check IFF_UP earlier in Tx path Fix typos in selftest/bpf files selftests/bpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() samples/bpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() bpftool: Use bpf_{btf,link,map,prog}_get_info_by_fd() libbpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() libbpf: Introduce bpf_{btf,link,map,prog}_get_info_by_fd() ... ==================== Link: https://lore.kernel.org/r/20230217221737.31122-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-21Merge tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linuxLinus Torvalds5-35/+20
Pull block updates from Jens Axboe: - NVMe updates via Christoph: - Small improvements to the logging functionality (Amit Engel) - Authentication cleanups (Hannes Reinecke) - Cleanup and optimize the DMA mapping cod in the PCIe driver (Keith Busch) - Work around the command effects for Format NVM (Keith Busch) - Misc cleanups (Keith Busch, Christoph Hellwig) - Fix and cleanup freeing single sgl (Keith Busch) - MD updates via Song: - Fix a rare crash during the takeover process - Don't update recovery_cp when curr_resync is ACTIVE - Free writes_pending in md_stop - Change active_io to percpu - Updates to drbd, inching us closer to unifying the out-of-tree driver with the in-tree one (Andreas, Christoph, Lars, Robert) - BFQ update adding support for multi-actuator drives (Paolo, Federico, Davide) - Make brd compliant with REQ_NOWAIT (me) - Fix for IOPOLL and queue entering, fixing stalled IO waiting on timeouts (me) - Fix for REQ_NOWAIT with multiple bios (me) - Fix memory leak in blktrace cleanup (Greg) - Clean up sbitmap and fix a potential hang (Kemeng) - Clean up some bits in BFQ, and fix a bug in the request injection (Kemeng) - Clean up the request allocation and issue code, and fix some bugs related to that (Kemeng) - ublk updates and fixes: - Add support for unprivileged ublk (Ming) - Improve device deletion handling (Ming) - Misc (Liu, Ziyang) - s390 dasd fixes (Alexander, Qiheng) - Improve utility of request caching and fixes (Anuj, Xiao) - zoned cleanups (Pankaj) - More constification for kobjs (Thomas) - blk-iocost cleanups (Yu) - Remove bio splitting from drivers that don't need it (Christoph) - Switch blk-cgroups to use struct gendisk. Some of this is now incomplete as select late reverts were done. (Christoph) - Add bvec initialization helpers, and convert callers to use that rather than open-coding it (Christoph) - Misc fixes and cleanups (Jinke, Keith, Arnd, Bart, Li, Martin, Matthew, Ulf, Zhong) * tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linux: (169 commits) brd: use radix_tree_maybe_preload instead of radix_tree_preload block: use proper return value from bio_failfast() block: bio-integrity: Copy flags when bio_integrity_payload is cloned block: Fix io statistics for cgroup in throttle path brd: mark as nowait compatible brd: check for REQ_NOWAIT and set correct page allocation mask brd: return 0/-error from brd_insert_page() block: sync mixed merged request's failfast with 1st bio's Revert "blk-cgroup: pin the gendisk in struct blkcg_gq" Revert "blk-cgroup: pass a gendisk to blkg_lookup" Revert "blk-cgroup: delay blk-cgroup initialization until add_disk" Revert "blk-cgroup: delay calling blkcg_exit_disk until disk_release" Revert "blk-cgroup: move the cgroup information to struct gendisk" nvme-pci: remove iod use_sgls nvme-pci: fix freeing single sgl block: ublk: check IO buffer based on flag need_get_data s390/dasd: Fix potential memleak in dasd_eckd_init() s390/dasd: sort out physical vs virtual pointers usage block: Remove the ALLOC_CACHE_SLACK constant block: make kobj_type structures constant ...
2023-02-20Merge tag 'fs.idmapped.v6.3' of ↵Linus Torvalds2-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs idmapping updates from Christian Brauner: - Last cycle we introduced the dedicated struct mnt_idmap type for mount idmapping and the required infrastucture in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). As promised in last cycle's pull request message this converts everything to rely on struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevant on the mount level. Especially for non-vfs developers without detailed knowledge in this area this was a potential source for bugs. This finishes the conversion. Instead of passing the plain namespace around this updates all places that currently take a pointer to a mnt_userns with a pointer to struct mnt_idmap. Now that the conversion is done all helpers down to the really low-level helpers only accept a struct mnt_idmap argument instead of two namespace arguments. Conflating mount and other idmappings will now cause the compiler to complain loudly thus eliminating the possibility of any bugs. This makes it impossible for filesystem developers to mix up mount and filesystem idmappings as they are two distinct types and require distinct helpers that cannot be used interchangeably. Everything associated with struct mnt_idmap is moved into a single separate file. With that change no code can poke around in struct mnt_idmap. It can only be interacted with through dedicated helpers. That means all filesystems are and all of the vfs is completely oblivious to the actual implementation of idmappings. We are now also able to extend struct mnt_idmap as we see fit. For example, we can decouple it completely from namespaces for users that don't require or don't want to use them at all. We can also extend the concept of idmappings so we can cover filesystem specific requirements. In combination with the vfs{g,u}id_t work we finished in v6.2 this makes this feature substantially more robust and thus difficult to implement wrong by a given filesystem and also protects the vfs. - Enable idmapped mounts for tmpfs and fulfill a longstanding request. A long-standing request from users had been to make it possible to create idmapped mounts for tmpfs. For example, to share the host's tmpfs mount between multiple sandboxes. This is a prerequisite for some advanced Kubernetes cases. Systemd also has a range of use-cases to increase service isolation. And there are more users of this. However, with all of the other work going on this was way down on the priority list but luckily someone other than ourselves picked this up. As usual the patch is tiny as all the infrastructure work had been done multiple kernel releases ago. In addition to all the tests that we already have I requested that Rodrigo add a dedicated tmpfs testsuite for idmapped mounts to xfstests. It is to be included into xfstests during the v6.3 development cycle. This should add a slew of additional tests. * tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits) shmem: support idmapped mounts for tmpfs fs: move mnt_idmap fs: port vfs{g,u}id helpers to mnt_idmap fs: port fs{g,u}id helpers to mnt_idmap fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap fs: port i_{g,u}id_{needs_}update() to mnt_idmap quota: port to mnt_idmap fs: port privilege checking helpers to mnt_idmap fs: port inode_owner_or_capable() to mnt_idmap fs: port inode_init_owner() to mnt_idmap fs: port acl to mnt_idmap fs: port xattr to mnt_idmap fs: port ->permission() to pass mnt_idmap fs: port ->fileattr_set() to pass mnt_idmap fs: port ->set_acl() to pass mnt_idmap fs: port ->get_acl() to pass mnt_idmap fs: port ->tmpfile() to pass mnt_idmap fs: port ->rename() to pass mnt_idmap fs: port ->mknod() to pass mnt_idmap fs: port ->mkdir() to pass mnt_idmap ...
2023-02-20SUNRPC: Fix occasional warning when destroying gss_krb5_enctypesChuck Lever2-6/+8
I'm guessing that the warning fired because there's some code path that is called on module unload where the gss_krb5_enctypes file was never set up. name 'gss_krb5_enctypes' WARNING: CPU: 0 PID: 6187 at fs/proc/generic.c:712 remove_proc_entry+0x38d/0x460 fs/proc/generic.c:712 destroy_krb5_enctypes_proc_entry net/sunrpc/auth_gss/svcauth_gss.c:1543 [inline] gss_svc_shutdown_net+0x7d/0x2b0 net/sunrpc/auth_gss/svcauth_gss.c:2120 ops_exit_list+0xb0/0x170 net/core/net_namespace.c:169 setup_net+0x9bd/0xe60 net/core/net_namespace.c:356 copy_net_ns+0x320/0x6b0 net/core/net_namespace.c:483 create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110 copy_namespaces+0x410/0x500 kernel/nsproxy.c:179 copy_process+0x311d/0x76b0 kernel/fork.c:2272 kernel_clone+0xeb/0x9a0 kernel/fork.c:2684 __do_sys_clone+0xba/0x100 kernel/fork.c:2825 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Reported-by: syzbot+04a8437497bcfb4afa95@syzkaller.appspotmail.com Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Remove ->xpo_secure_port()Chuck Lever4-10/+3
There's no need for the cost of this extra virtual function call during every RPC transaction: the RQ_SECURE bit can be set properly in ->xpo_recvfrom() instead. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Fix whitespace damage in svcauth_unix.cChuck Lever1-3/+3
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add encryption self-testsChuck Lever3-5/+142
With the KUnit infrastructure recently added, we are free to define other unit tests particular to our implementation. As an example, I've added a self-test that encrypts then decrypts a string, and checks the result. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add RFC 8009 encryption KUnit testsChuck Lever3-4/+354
RFC 8009 provides sample encryption results. Add KUnit tests to ensure our implementation derives the expected results for the provided sample input. I hate how large this test is, but using non-standard key usage values means rfc8009_encrypt_case() can't simply reuse ->import_ctx to allocate and key its ciphers; and the test provides its own confounders, which means krb5_etm_encrypt() can't be used directly. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add RFC 8009 checksum KUnit testsChuck Lever1-0/+53
RFC 8009 provides sample checksum results. Add KUnit tests to ensure our implementation derives the expected results for the provided sample input. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add KDF-HMAC-SHA2 Kunit testsChuck Lever2-1/+115
RFC 8009 provides sample key derivation results, so Kunit tests are added to ensure our implementation derives the expected keys for the provided sample input. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add encryption KUnit tests for the RFC 6803 encryption typesChuck Lever1-0/+400
Add tests for the new-to-RPCSEC Camellia cipher. Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add checksum KUnit tests for the RFC 6803 encryption typesChuck Lever2-0/+169
Test the new-to-RPCSEC CMAC digest algorithm. Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add KDF KUnit tests for the RFC 6803 encryption typesChuck Lever2-1/+127
The Camellia enctypes use a new KDF, so add some tests to ensure it is working properly. Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add Kunit tests for RFC 3962-defined encryption/decryptionChuck Lever4-9/+296
Add Kunit tests for ENCTYPE_AES128_CTS_HMAC_SHA1_96. The test vectors come from RFC 3962 Appendix B. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add KUnit tests RFC 3961 Key DerivationChuck Lever1-0/+227
RFC 3961 Appendix A provides tests for the KDF specified in that document as well as other parts of Kerberos. The other three usage scenarios in Section 10 are not implemented by the Linux kernel's RPCSEC GSS Kerberos 5 mechanism, so tests are not added for those. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Export get_gss_krb5_enctype()Chuck Lever2-19/+17
I plan to add KUnit tests that will need enctype profile information. Export the enctype profile lookup function. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add KUnit tests for rpcsec_krb5.koChuck Lever6-4/+298
The Kerberos RFCs provide test vectors to verify the operation of an implementation. Introduce a KUnit test framework to exercise the Linux kernel's implementation of Kerberos. Start with test cases for the RFC 3961-defined n-fold function. The sample vectors for that are found in RFC 3961 Section 10. Run the GSS Kerberos 5 mechanism's unit tests with this command: $ ./tools/testing/kunit/kunit.py run \ --kunitconfig ./net/sunrpc/.kunitconfig Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>