summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2022-07-22wifi: mac80211: optionally implement MLO multicast TXJohannes Berg1-0/+10
For drivers using software encryption for multicast TX, such as mac80211_hwsim, mac80211 needs to duplicate the multicast frames on each link, if MLO is enabled. Do this, but don't just make it dependent on the key but provide a separate flag for drivers to opt out of this. This is not very efficient, I expect that drivers will do it in firmware/hardware or at least with DMA engine assistence, so this is mostly for hwsim. To make this work, also implement the SNS11 sequence number space that an AP MLD shall have, and modify the API to the __ieee80211_subif_start_xmit() function to always require the link ID bits to be set. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: mac80211: expand ieee80211_mgmt_tx() for MLOJohannes Berg1-1/+3
There are a couple of new things that should be possible with MLO: * selecting the link to transmit to a station by link ID, which a previous patch added to the nl80211 API * selecting the link by frequency, similarly * allowing transmittion to an MLD without specifying any channel or link ID, with MLD addresses Enable these use cases. Also fix the address comparison in client mode to use the AP (MLD) address. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: nl80211: add MLO link ID to the NL80211_CMD_FRAME TX APIJohannes Berg2-0/+8
Allow optionally specifying the link ID to transmit on, which can be done instead of the link frequency, on an MLD addressed frame. Both can also be omitted in which case the frame must be MLD addressed and link selection (and address translation) will be done on lower layers. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: cfg80211: report link ID in NL80211_CMD_FRAMEJohannes Berg1-0/+5
If given by the underlying driver, report the link ID for MLO in NL80211_CMD_FRAME. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: mac80211: add hardware timestamps for RX and TXAvraham Stern1-1/+27
When the low level driver reports hardware timestamps for frame TX status or frame RX, pass the timestamps to cfg80211. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: cfg80211: add hardware timestamps to frame RX infoAvraham Stern1-0/+4
Add hardware timestamps to management frame RX info. This shall be used by drivers that support hardware timestamping for Timing measurement and Fine timing measurement action frames RX. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: cfg80211/nl80211: move rx management data into a structAvraham Stern1-4/+56
The functions for reporting rx management take many arguments. Collect all the arguments into a struct, which also make it easier to add more arguments if needed. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: cfg80211: add a function for reporting TX status with hardware timestampsAvraham Stern1-2/+45
Add a function for reporting TX status with hardware timestamps. This function shall be used for reporting the TX status of Timing measurement and Fine timing measurement action frames by devices that support reporting hardware timestamps. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: nl80211: add RX and TX timestamp attributesAvraham Stern1-1/+21
Add attributes for reporting hardware timestamps for management frames RX and TX. These attributes will be used for reporting hardware timestamps for Timing measurement and Fine Timing Measurement action frames, which will allow userspace applications to measure the path delay between devices and sync clocks. For TX, these attributes are used for reporting the frame RX time and the ack TX time. For TX, they are used for reporting the frame TX time and the ack RX time. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: ieee80211: add helper functions for detecting TM/FTM framesAvraham Stern1-0/+54
Add helper functions for detection timing measurement and fine timing measurement frames. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22wifi: nl80211/mac80211: clarify link ID in control port TXJohannes Berg1-0/+6
Clarify the link ID behaviour in control port TX, we need it to select the link to transmit on for both MLD and non-MLD receivers, but select the link address as the SA only if the receiver is not an MLD. Fixes: 67207bab9341 ("wifi: cfg80211/mac80211: Support control port TX from specific link") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-07-22net: add missing includes and forward declarations under net/Jakub Kicinski63-11/+183
This patch adds missing includes to headers under include/net. All these problems are currently masked by the existing users including the missing dependency before the broken header. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_adv_win_scale.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_adv_win_scale, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22net: netfilter: Add kfuncs to set and change CT statusLorenzo Bianconi1-0/+2
Introduce bpf_ct_set_status and bpf_ct_change_status kfunc helpers in order to set nf_conn field of allocated entry or update nf_conn status field of existing inserted entry. Use nf_ct_change_status_common to share the permitted status field changes between netlink and BPF side by refactoring ctnetlink_change_status. It is required to introduce two kfuncs taking nf_conn___init and nf_conn instead of sharing one because KF_TRUSTED_ARGS flag causes strict type checking. This would disallow passing nf_conn___init to kfunc taking nf_conn, and vice versa. We cannot remove the KF_TRUSTED_ARGS flag as we only want to accept refcounted pointers and not e.g. ct->master. Hence, bpf_ct_set_* kfuncs are meant to be used on allocated CT, and bpf_ct_change_* kfuncs are meant to be used on inserted or looked up CT entry. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220721134245.2450-10-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-22net: netfilter: Add kfuncs to set and change CT timeoutKumar Kartikeya Dwivedi1-0/+2
Introduce bpf_ct_set_timeout and bpf_ct_change_timeout kfunc helpers in order to change nf_conn timeout. This is same as ctnetlink_change_timeout, hence code is shared between both by extracting it out to __nf_ct_change_timeout. It is also updated to return an error when it sees IPS_FIXED_TIMEOUT_BIT bit in ct->status, as that check was missing. It is required to introduce two kfuncs taking nf_conn___init and nf_conn instead of sharing one because KF_TRUSTED_ARGS flag causes strict type checking. This would disallow passing nf_conn___init to kfunc taking nf_conn, and vice versa. We cannot remove the KF_TRUSTED_ARGS flag as we only want to accept refcounted pointers and not e.g. ct->master. Apart from this, bpf_ct_set_timeout is only called for newly allocated CT so it doesn't need to inspect the status field just yet. Sharing the helpers even if it was possible would make timeout setting helper sensitive to order of setting status and timeout after allocation. Hence, bpf_ct_set_* kfuncs are meant to be used on allocated CT, and bpf_ct_change_* kfuncs are meant to be used on inserted or looked up CT entry. Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220721134245.2450-9-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-22net: netfilter: Add kfuncs to allocate and insert CTLorenzo Bianconi1-0/+15
Introduce bpf_xdp_ct_alloc, bpf_skb_ct_alloc and bpf_ct_insert_entry kfuncs in order to insert a new entry from XDP and TC programs. Introduce bpf_nf_ct_tuple_parse utility routine to consolidate common code. We extract out a helper __nf_ct_set_timeout, used by the ctnetlink and nf_conntrack_bpf code, extract it out to nf_conntrack_core, so that nf_conntrack_bpf doesn't need a dependency on CONFIG_NF_CT_NETLINK. Later this helper will be reused as a helper to set timeout of allocated but not yet inserted CT entry. The allocation functions return struct nf_conn___init instead of nf_conn, to distinguish allocated CT from an already inserted or looked up CT. This is later used to enforce restrictions on what kfuncs allocated CT can be used with. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220721134245.2450-8-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-22bpf: Add support for forcing kfunc args to be trustedKumar Kartikeya Dwivedi1-0/+32
Teach the verifier to detect a new KF_TRUSTED_ARGS kfunc flag, which means each pointer argument must be trusted, which we define as a pointer that is referenced (has non-zero ref_obj_id) and also needs to have its offset unchanged, similar to how release functions expect their argument. This allows a kfunc to receive pointer arguments unchanged from the result of the acquire kfunc. This is required to ensure that kfunc that operate on some object only work on acquired pointers and not normal PTR_TO_BTF_ID with same type which can be obtained by pointer walking. The restrictions applied to release arguments also apply to trusted arguments. This implies that strict type matching (not deducing type by recursively following members at offset) and OBJ_RELEASE offset checks (ensuring they are zero) are used for trusted pointer arguments. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220721134245.2450-5-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-22bpf: Switch to new kfunc flags infrastructureKumar Kartikeya Dwivedi2-24/+12
Instead of populating multiple sets to indicate some attribute and then researching the same BTF ID in them, prepare a single unified BTF set which indicates whether a kfunc is allowed to be called, and also its attributes if any at the same time. Now, only one call is needed to perform the lookup for both kfunc availability and its attributes. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220721134245.2450-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-22bpf: Introduce 8-byte BTF setKumar Kartikeya Dwivedi1-4/+64
Introduce support for defining flags for kfuncs using a new set of macros, BTF_SET8_START/BTF_SET8_END, which define a set which contains 8 byte elements (each of which consists of a pair of BTF ID and flags), using a new BTF_ID_FLAGS macro. This will be used to tag kfuncs registered for a certain program type as acquire, release, sleepable, ret_null, etc. without having to create more and more sets which was proving to be an unscalable solution. Now, when looking up whether a kfunc is allowed for a certain program, we can also obtain its kfunc flags in the same call and avoid further lookups. The resolve_btfids change is split into a separate patch. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220721134245.2450-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-22Merge tag 'drm-misc-fixes-2022-07-21' of ↵Dave Airlie1-2/+2
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A scheduling-while-atomic fix for drm/scheduler, a locking fix for TTM, a typo fix for panel-edp and a resource removal fix for imx/dcss Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220721085550.hrwbukj34y56rzva@houat
2022-07-22Bluetooth: mgmt: Fix using hci_conn_abortLuiz Augusto von Dentz1-0/+2
This fixes using hci_conn_abort instead of using hci_conn_abort_sync. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: Add bt_statusLuiz Augusto von Dentz1-0/+1
This adds bt_status which can be used to convert Unix errno to Bluetooth status. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: hci_sync: Refactor remove Adv MonitorManish Mandlik1-4/+2
Make use of hci_cmd_sync_queue for removing an advertisement monitor. Signed-off-by: Manish Mandlik <mmandlik@google.com> Reviewed-by: Miao-chen Chou <mcchou@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: hci_sync: Refactor add Adv MonitorManish Mandlik1-4/+1
Make use of hci_cmd_sync_queue for adding an advertisement monitor. Signed-off-by: Manish Mandlik <mmandlik@google.com> Reviewed-by: Miao-chen Chou <mcchou@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: hci_sync: Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTINGZijun Hu1-11/+0
Core driver addtionally checks LMP feature bit "Erroneous Data Reporting" instead of quirk HCI_QUIRK_BROKEN_ERR_DATA_REPORTING to decide if HCI commands HCI_Read|Write_Default_Erroneous_Data_Reporting are broken, so remove this unnecessary quirk. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Tested-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: hci_sync: Check LMP feature bit instead of quirkZijun Hu1-0/+1
BT core driver should addtionally check LMP feature bit "Erroneous Data Reporting" instead of quirk HCI_QUIRK_BROKEN_ERR_DATA_REPORTING set by BT device driver to decide if HCI commands HCI_Read|Write_Default_Erroneous_Data_Reporting are broken. BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 2, Part C | page 587 This feature indicates whether the device is able to support the Packet_Status_Flag and the HCI commands HCI_Write_Default_- Erroneous_Data_Reporting and HCI_Read_Default_Erroneous_- Data_Reporting. the quirk was introduced by 'commit cde1a8a99287 ("Bluetooth: btusb: Fix and detect most of the Chinese Bluetooth controllers")' to mark HCI commands HCI_Read|Write_Default_Erroneous_Data_Reporting broken by BT device driver, but the reason why these two HCI commands are broken is that feature "Erroneous Data Reporting" is not enabled by firmware, this scenario is illustrated by below log of QCA controllers with USB I/F: @ RAW Open: hcitool (privileged) version 2.22 < HCI Command: Read Local Supported Commands (0x04|0x0002) plen 0 > HCI Event: Command Complete (0x0e) plen 68 Read Local Supported Commands (0x04|0x0002) ncmd 1 Status: Success (0x00) Commands: 288 entries ...... Read Default Erroneous Data Reporting (Octet 18 - Bit 2) Write Default Erroneous Data Reporting (Octet 18 - Bit 3) ...... < HCI Command: Read Default Erroneous Data Reporting (0x03|0x005a) plen 0 > HCI Event: Command Complete (0x0e) plen 4 Read Default Erroneous Data Reporting (0x03|0x005a) ncmd 1 Status: Unknown HCI Command (0x01) < HCI Command: Read Local Supported Features (0x04|0x0003) plen 0 > HCI Event: Command Complete (0x0e) plen 12 Read Local Supported Features (0x04|0x0003) ncmd 1 Status: Success (0x00) Features: 0xff 0xfe 0x0f 0xfe 0xd8 0x3f 0x5b 0x87 3 slot packets ...... Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Tested-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: clean up error pointer checkingDan Carpenter1-1/+1
The bt_skb_sendmsg() function can't return NULL so there is no need to check for that. Several of these checks were removed previously but this one was missed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: HCI: Fix not always setting Scan Response/Advertising DataLuiz Augusto von Dentz1-0/+11
The scan response and advertising data needs to be tracked on a per instance (adv_info) since when these instaces are removed so are their data, to fix that new flags are introduced which is used to mark when the data changes and then checked to confirm when the data needs to be synced with the controller. Tested-by: Tedd Ho-Jeong An <tedd.an@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-22Bluetooth: Unregister suspend with userchannelAbhishek Pandit-Subedi1-0/+2
When HCI_USERCHANNEL is used, unregister the suspend notifier when binding and register when releasing. The userchannel socket should be left alone after open is completed. Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-07-22Bluetooth: When HCI work queue is drained, only queue chained workSchspa Shi1-0/+1
The HCI command, event, and data packet processing workqueue is drained to avoid deadlock in commit 76727c02c1e1 ("Bluetooth: Call drain_workqueue() before resetting state"). There is another delayed work, which will queue command to this drained workqueue. Which results in the following error report: Bluetooth: hci2: command 0x040f tx timeout WARNING: CPU: 1 PID: 18374 at kernel/workqueue.c:1438 __queue_work+0xdad/0x1140 Workqueue: events hci_cmd_timeout RIP: 0010:__queue_work+0xdad/0x1140 RSP: 0000:ffffc90002cffc60 EFLAGS: 00010093 RAX: 0000000000000000 RBX: ffff8880b9d3ec00 RCX: 0000000000000000 RDX: ffff888024ba0000 RSI: ffffffff814e048d RDI: ffff8880b9d3ec08 RBP: 0000000000000008 R08: 0000000000000000 R09: 00000000b9d39700 R10: ffffffff814f73c6 R11: 0000000000000000 R12: ffff88807cce4c60 R13: 0000000000000000 R14: ffff8880796d8800 R15: ffff8880796d8800 FS: 0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c0174b4000 CR3: 000000007cae9000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? queue_work_on+0xcb/0x110 ? lockdep_hardirqs_off+0x90/0xd0 queue_work_on+0xee/0x110 process_one_work+0x996/0x1610 ? pwq_dec_nr_in_flight+0x2a0/0x2a0 ? rwlock_bug.part.0+0x90/0x90 ? _raw_spin_lock_irq+0x41/0x50 worker_thread+0x665/0x1080 ? process_one_work+0x1610/0x1610 kthread+0x2e9/0x3a0 ? kthread_complete_and_exit+0x40/0x40 ret_from_fork+0x1f/0x30 </TASK> To fix this, we can add a new HCI_DRAIN_WQ flag, and don't queue the timeout workqueue while command workqueue is draining. Fixes: 76727c02c1e1 ("Bluetooth: Call drain_workqueue() before resetting state") Reported-by: syzbot+63bed493aebbf6872647@syzkaller.appspotmail.com Signed-off-by: Schspa Shi <schspa@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-07-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski16-59/+106
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-21Merge tag 'net-5.19-rc8' of ↵Linus Torvalds9-23/+45
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can. Still no major regressions, most of the changes are still due to data races fixes, plus the usual bunch of drivers fixes. Previous releases - regressions: - tcp/udp: make early_demux back namespacified. - dsa: fix issues with vlan_filtering_is_global Previous releases - always broken: - ip: fix data-races around ipv4_net_table (round 2, 3 & 4) - amt: fix validation and synchronization bugs - can: fix detection of mcp251863 - eth: iavf: fix handling of dummy receive descriptors - eth: lan966x: fix issues with MAC table - eth: stmmac: dwmac-mediatek: fix clock issue Misc: - dsa: update documentation" * tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication net/sched: cls_api: Fix flow action initialization tcp: Fix data-races around sysctl_tcp_max_reordering. tcp: Fix a data-race around sysctl_tcp_abort_on_overflow. tcp: Fix a data-race around sysctl_tcp_rfc1337. tcp: Fix a data-race around sysctl_tcp_stdurg. tcp: Fix a data-race around sysctl_tcp_retrans_collapse. tcp: Fix data-races around sysctl_tcp_slow_start_after_idle. tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. tcp: Fix data-races around sysctl_tcp_recovery. tcp: Fix a data-race around sysctl_tcp_early_retrans. tcp: Fix data-races around sysctl knobs related to SYN option. udp: Fix a data-race around sysctl_udp_l3mdev_accept. ip: Fix data-races around sysctl_ip_prot_sock. ipv4: Fix data-races around sysctl_fib_multipath_hash_fields. ipv4: Fix data-races around sysctl_fib_multipath_hash_policy. ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh. can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe() can: mcp251xfd: fix detection of mcp251863 Documentation: fix udp_wmem_min in ip-sysctl.rst ...
2022-07-21mmu_gather: Force tlb-flush VM_PFNMAP vmasPeter Zijlstra1-16/+17
Jann reported a race between munmap() and unmap_mapping_range(), where unmap_mapping_range() will no-op once unmap_vmas() has unlinked the VMA; however munmap() will not yet have invalidated the TLBs. Therefore unmap_mapping_range() will complete while there are still (stale) TLB entries for the specified range. Mitigate this by force flushing TLBs for VM_PFNMAP ranges. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21mmu_gather: Let there be one tlb_{start,end}_vma() implementationPeter Zijlstra1-13/+2
Now that architectures are no longer allowed to override tlb_{start,end}_vma() re-arrange code so that there is only one implementation for each of these functions. This much simplifies trying to figure out what they actually do. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21mmu_gather: Remove per arch tlb_{start,end}_vma()Peter Zijlstra1-2/+19
Scattered across the archs are 3 basic forms of tlb_{start,end}_vma(). Provide two new MMU_GATHER_knobs to enumerate them and remove the per arch tlb_{start,end}_vma() implementations. - MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range() but does *NOT* want to call it for each VMA. - MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the invalidate across multiple VMAs if possible. With these it is possible to capture the three forms: 1) empty stubs; select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS 2) start: flush_cache_range(), end: empty; select MMU_GATHER_MERGE_VMAS 3) start: flush_cache_range(), end: flush_tlb_range(); default Obviously, if the architecture does not have flush_cache_range() then it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21net/cdc_ncm: Increase NTB max RX/TX values to 64kbŁukasz Spintzyk1-2/+2
DisplayLink ethernet devices require NTB buffers larger then 32kb in order to run with highest performance. This patch is changing upper limit of the rx and tx buffers. Those buffers are initialized with CDC_NCM_NTB_DEF_SIZE_RX and CDC_NCM_NTB_DEF_SIZE_TX which is 16kb so by default no device is affected by increased limit. Rx and tx buffer is increased under two conditions: - Device need to advertise that it supports higher buffer size in dwNtbMaxInMaxSize and dwNtbMaxOutMaxSize. - cdc_ncm/rx_max and cdc_ncm/tx_max driver parameters must be adjusted with udev rule or ethtool. Summary of testing and performance results: Tests were performed on following devices: - DisplayLink DL-3xxx family device - DisplayLink DL-6xxx family device - ASUS USB-C2500 2.5G USB3 ethernet adapter - Plugable USB3 1G USB3 ethernet adapter - EDIMAX EU-4307 USB-C ethernet adapter - Dell DBQBCBC064 USB-C ethernet adapter Performance measurements were done with: - iperf3 between two linux boxes - http://openspeedtest.com/ instance running on local test machine Insights from tests results: - All except one from third party usb adapters were not affected by increased buffer size to their advertised dwNtbOutMaxSize and dwNtbInMaxSize. Devices were generally reaching 912-940Mbps both download and upload. Only EDIMAX adapter experienced decreased download size from 929Mbps to 827Mbps with iper3, with openspeedtest decrease was from 968Mbps to 886Mbps. - DisplayLink DL-3xxx family devices experienced performance increase with iperf3 download from 300Mbps to 870Mbps and upload from 782Mbps to 844Mbps. With openspeedtest download increased from 556Mbps to 873Mbps and upload from 727Mbps to 973Mbps - DiplayLink DL-6xxx family devices are not affected by increased buffer size. Signed-off-by: Łukasz Spintzyk <lukasz.spintzyk@synaptics.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20220720060518.541-2-lukasz.spintzyk@synaptics.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski9-66/+115
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next: 1) Simplify nf_ct_get_tuple(), from Jackie Liu. 2) Add format to request_module() call, from Bill Wendling. 3) Add /proc/net/stats/nf_flowtable to monitor in-flight pending hardware offload objects to be processed, from Vlad Buslov. 4) Missing rcu annotation and accessors in the netfilter tree, from Florian Westphal. 5) Merge h323 conntrack helper nat hooks into single object, also from Florian. 6) A batch of update to fix sparse warnings treewide, from Florian Westphal. 7) Move nft_cmp_fast_mask() where it used, from Florian. 8) Missing const in nf_nat_initialized(), from James Yonan. 9) Use bitmap API for Maglev IPVS scheduler, from Christophe Jaillet. 10) Use refcount_inc instead of _inc_not_zero in flowtable, from Florian Westphal. 11) Remove pr_debug in xt_TPROXY, from Nathan Cancellor. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: xt_TPROXY: remove pr_debug invocations netfilter: flowtable: prefer refcount_inc netfilter: ipvs: Use the bitmap API to allocate bitmaps netfilter: nf_nat: in nf_nat_initialized(), use const struct nf_conn * netfilter: nf_tables: move nft_cmp_fast_mask to where its used netfilter: nf_tables: use correct integer types netfilter: nf_tables: add and use BE register load-store helpers netfilter: nf_tables: use the correct get/put helpers netfilter: x_tables: use correct integer types netfilter: nfnetlink: add missing __be16 cast netfilter: nft_set_bitmap: Fix spelling mistake netfilter: h323: merge nat hook pointers into one netfilter: nf_conntrack: use rcu accessors where needed netfilter: nf_conntrack: add missing __rcu annotations netfilter: nf_flow_table: count pending offload workqueue tasks net/sched: act_ct: set 'net' pointer when creating new nf_flow_table netfilter: conntrack: use correct format characters netfilter: conntrack: use fallthrough to cleanup ==================== Link: https://lore.kernel.org/r/20220720230754.209053-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-21Merge tag 'mlx5-updates-2022-07-17' of ↵Jakub Kicinski1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-07-17 1) Add resiliency for lost completions for PTP TX port timestamp 2) Report Header-data split state via ethtool 3) Decouple HTB code from main regular TX code * tag 'mlx5-updates-2022-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: CT: Remove warning of ignore_flow_level support for non PF net/mlx5e: Add resiliency for PTP TX port timestamp net/mlx5: Expose ts_cqe_metadata_size2wqe_counter net/mlx5e: HTB, move htb functions to a new file net/mlx5e: HTB, change functions name to follow convention net/mlx5e: HTB, remove priv from htb function calls net/mlx5e: HTB, hide and dynamically allocate mlx5e_htb structure net/mlx5e: HTB, move stats and max_sqs to priv net/mlx5e: HTB, move section comment to the right place net/mlx5e: HTB, move ids to selq_params struct net/mlx5e: HTB, reduce visibility of htb functions net/mlx5e: Fix mqprio_rl handling on devlink reload net/mlx5e: Report header-data split state through ethtool ==================== Link: https://lore.kernel.org/r/20220719203529.51151-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-20bpf: Fix bpf_trampoline_{,un}link_cgroup_shim ifdef guardsStanislav Fomichev1-3/+7
They were updated in kernel/bpf/trampoline.c to fix another build issue. We should to do the same for include/linux/bpf.h header. Fixes: 3908fcddc65d ("bpf: fix lsm_cgroup build errors on esoteric configs") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20220720155220.4087433-1-sdf@google.com
2022-07-20tools: Fixed MIPS builds due to struct flock re-definitionFlorian Fainelli1-0/+2
Building perf for MIPS failed after 9f79b8b72339 ("uapi: simplify __ARCH_FLOCK{,64}_PAD a little") with the following error: CC /home/fainelli/work/buildroot/output/bmips/build/linux-custom/tools/perf/trace/beauty/fcntl.o In file included from ../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:77, from ../include/uapi/linux/fcntl.h:5, from trace/beauty/fcntl.c:10: ../include/uapi/asm-generic/fcntl.h:188:8: error: redefinition of 'struct flock' struct flock { ^~~~~ In file included from ../include/uapi/linux/fcntl.h:5, from trace/beauty/fcntl.c:10: ../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:63:8: note: originally defined here struct flock { ^~~~~ This is due to the local copy under tools/include/uapi/asm-generic/fcntl.h including the toolchain's kernel headers which already define 'struct flock' and define HAVE_ARCH_STRUCT_FLOCK to future inclusions make a decision as to whether re-defining 'struct flock' is appropriate or not. Make sure what do not re-define 'struct flock' when HAVE_ARCH_STRUCT_FLOCK is already defined. Fixes: 9f79b8b72339 ("uapi: simplify __ARCH_FLOCK{,64}_PAD a little") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> [arnd: sync with include/uapi/asm-generic/fcntl.h as well] Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-20Merge tag 'linux-can-next-for-5.20-20220720' of ↵David S. Miller1-1/+19
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== this is a pull request of 29 patches for net-next/master. The first 6 patches target the slcan driver. Dan Carpenter contributes a hardening patch, followed by 5 cleanup patches. Biju Das contributes 5 patches to prepare the sja1000 driver to support the Renesas RZ/N1 SJA1000 CAN controller. Dario Binacchi's patch for the slcan driver fixes a sleep with held spin lock. Another patch by Dario Binacchi fixes a wrong comment in the c_can driver. Pavel Pisa updates the CTU CAN FD IP core registers. Stephane Grosjean contributes 3 patches to the peak_usb driver for cleanups and support of a new MCU. The last 12 patches are by Vincent Mailhol, they fix and improve the txerr and rxerr reporting in all CAN drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.Kuniyuki Iwashima1-2/+2
While reading sysctl_tcp_slow_start_after_idle, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 35089bb203f4 ("[TCP]: Add tcp_slow_start_after_idle sysctl.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20udp: Fix a data-race around sysctl_udp_l3mdev_accept.Kuniyuki Iwashima1-1/+1
While reading sysctl_udp_l3mdev_accept, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 63a6fff353d0 ("net: Avoid receiving packets with an l3mdev on unbound UDP sockets") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20ip: Fix data-races around sysctl_ip_prot_sock.Kuniyuki Iwashima1-1/+1
sysctl_ip_prot_sock is accessed concurrently, and there is always a chance of data-race. So, all readers and writers need some basic protection to avoid load/store-tearing. Fixes: 4548b683b781 ("Introduce a sysctl that modifies the value of PROT_SOCK.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20can: error: add definitions for the different CAN error thresholdsVincent Mailhol1-0/+13
Currently, drivers are using magic numbers to derive the CAN error states from the error counter. Add three macro declarations to remediate this. For reference, the error-active, error-passive and bus-off are defined in ISO 11898, section 12.1.4.2 "Error counting". Although ISO 11898 does not define error-warning state, this extra value is also commonly used and is thus also added. Link: https://lore.kernel.org/all/20220719143550.3681-13-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-20can: add CAN_ERR_CNT flag to notify availability of error counterVincent Mailhol1-0/+2
Add a dedicated flag in uapi/linux/can/error.h to notify the userland that fields data[6] and data[7] of the CAN error frame were respectively populated with the tx and rx error counters. For all driver tree-wide, set up this flags whenever needed. Link: https://lore.kernel.org/all/20220719143550.3681-12-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-20can: error: specify the values of data[5..7] of CAN error framesVincent Mailhol1-1/+4
Currently, data[5..7] of struct can_frame, when used as a CAN error frame, are defined as being "controller specific". Device specific behaviours are problematic because it prevents someone from writing code which is portable between devices. As a matter of fact, data[5] is never used, data[6] is always used to report TX error counter and data[7] is always used to report RX error counter. can-utils also relies on this. This patch updates the comment in the uapi header to specify that data[5] is reserved (and thus should not be used) and that data[6..7] are used for error counters. Fixes: 0d66548a10cb ("[CAN]: Add PF_CAN core module") Link: https://lore.kernel.org/all/20220719143550.3681-11-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-20net/sched: remove qdisc_root_lock() helperDavide Caratti1-19/+0
the last caller has been removed with commit 96f5e66e8a79 ("mac80211: fix aggregation for hardware with ampdu queues"), so it's safe to remove this function. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/703d549e3088367651d92a059743f1be848d74b7.1658133689.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-20Merge branch 'io_uring-zerocopy-send' of ↵Jakub Kicinski2-18/+53
git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux Pavel Begunkov says: ==================== io_uring zerocopy send The patchset implements io_uring zerocopy send. It works with both registered and normal buffers, mixing is allowed but not recommended. Apart from usual request completions, just as with MSG_ZEROCOPY, io_uring separately notifies the userspace when buffers are freed and can be reused (see API design below), which is delivered into io_uring's Completion Queue. Those "buffer-free" notifications are not necessarily per request, but the userspace has control over it and should explicitly attaching a number of requests to a single notification. The series also adds some internal optimisations when used with registered buffers like removing page referencing. From the kernel networking perspective there are two main changes. The first one is passing ubuf_info into the network layer from io_uring (inside of an in kernel struct msghdr). This allows extra optimisations, e.g. ubuf_info caching on the io_uring side, but also helps to avoid cross-referencing and synchronisation problems. The second part is an optional optimisation removing page referencing for requests with registered buffers. Benchmarking UDP with an optimised version of the selftest (see [1]), which sends a bunch of requests, waits for completions and repeats. "+ flush" column posts one additional "buffer-free" notification per request, and just "zc" doesn't post buffer notifications at all. NIC (requests / second): IO size | non-zc | zc | zc + flush 4000 | 495134 | 606420 (+22%) | 558971 (+12%) 1500 | 551808 | 577116 (+4.5%) | 565803 (+2.5%) 1000 | 584677 | 592088 (+1.2%) | 560885 (-4%) 600 | 596292 | 598550 (+0.4%) | 555366 (-6.7%) dummy (requests / second): IO size | non-zc | zc | zc + flush 8000 | 1299916 | 2396600 (+84%) | 2224219 (+71%) 4000 | 1869230 | 2344146 (+25%) | 2170069 (+16%) 1200 | 2071617 | 2361960 (+14%) | 2203052 (+6%) 600 | 2106794 | 2381527 (+13%) | 2195295 (+4%) Previously it also brought a massive performance speedup compared to the msg_zerocopy tool (see [3]), which is probably not super interesting. There is also an additional bunch of refcounting optimisations that was omitted from the series for simplicity and as they don't change the picture drastically, they will be sent as follow up, as well as flushing optimisations closing the performance gap b/w two last columns. For TCP on localhost (with hacks enabling localhost zerocopy) and including additional overhead for receive: IO size | non-zc | zc 1200 | 4174 | 4148 4096 | 7597 | 11228 Using a real NIC 1200 bytes, zc is worse than non-zc ~5-10%, maybe the omitted optimisations will somewhat help, should look better for 4000, but couldn't test properly because of setup problems. Links: liburing (benchmark + tests): [1] https://github.com/isilence/liburing/tree/zc_v4 kernel repo: [2] https://github.com/isilence/linux/tree/zc_v4 RFC v1: [3] https://lore.kernel.org/io-uring/cover.1638282789.git.asml.silence@gmail.com/ RFC v2: https://lore.kernel.org/io-uring/cover.1640029579.git.asml.silence@gmail.com/ Net patches based: git@github.com:isilence/linux.git zc_v4-net-base or https://github.com/isilence/linux/tree/zc_v4-net-base API design overview: The series introduces an io_uring concept of notifactors. From the userspace perspective it's an entity to which it can bind one or more requests and then requesting to flush it. Flushing a notifier makes it impossible to attach new requests to it, and instructs the notifier to post a completion once all requests attached to it are completed and the kernel doesn't need the buffers anymore. Notifications are stored in notification slots, which should be registered as an array in io_uring. Each slot stores only one notifier at any particular moment. Flushing removes it from the slot and the slot automatically replaces it with a new notifier. All operations with notifiers are done by specifying an index of a slot it's currently in. When registering a notification the userspace specifies a u64 tag for each slot, which will be copied in notification completion entries as cqe::user_data. cqe::res is 0 and cqe::flags is equal to wrap around u32 sequence number counting notifiers of a slot. ==================== Link: https://lore.kernel.org/r/cover.1657643355.git.asml.silence@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-20net: introduce __skb_fill_page_desc_noaccPavel Begunkov1-11/+17
Managed pages contain pinned userspace pages and controlled by upper layers, there is no need in tracking skb->pfmemalloc for them. Introduce a helper for filling frags but ignoring page tracking, it'll be needed later. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>