summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2022-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski4-9/+4
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08Merge tag 'net-5.18-rc2' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - new code bugs: - mctp: correct mctp_i2c_header_create result - eth: fungible: fix reference to __udivdi3 on 32b builds - eth: micrel: remove latencies support lan8814 Previous releases - regressions: - bpf: resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT - vrf: fix packet sniffing for traffic originating from ip tunnels - rxrpc: fix a race in rxrpc_exit_net() - dsa: revert "net: dsa: stop updating master MTU from master.c" - eth: ice: fix MAC address setting Previous releases - always broken: - tls: fix slab-out-of-bounds bug in decrypt_internal - bpf: support dual-stack sockets in bpf_tcp_check_syncookie - xdp: fix coalescing for page_pool fragment recycling - ovs: fix leak of nested actions - eth: sfc: - add missing xdp queue reinitialization - fix using uninitialized xdp tx_queue - eth: ice: - clear default forwarding VSI during VSI release - fix broken IFF_ALLMULTI handling - synchronize_rcu() when terminating rings - eth: qede: confirm skb is allocated before using - eth: aqc111: fix out-of-bounds accesses in RX fixup - eth: slip: fix NPD bug in sl_tx_timeout()" * tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits) drivers: net: slip: fix NPD bug in sl_tx_timeout() bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets bpf: Support dual-stack sockets in bpf_tcp_check_syncookie myri10ge: fix an incorrect free for skb in myri10ge_sw_tso net: usb: aqc111: Fix out-of-bounds accesses in RX fixup qede: confirm skb is allocated before using net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n net: phy: mscc-miim: reject clause 45 register accesses net: axiemac: use a phandle to reference pcs_phy dt-bindings: net: add pcs-handle attribute net: axienet: factor out phy_node in struct axienet_local net: axienet: setup mdio unconditionally net: sfc: fix using uninitialized xdp tx_queue rxrpc: fix a race in rxrpc_exit_net() net: openvswitch: fix leak of nested actions net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address() net: openvswitch: don't send internal clone attribute to the userspace. net: micrel: Fix KS8851 Kconfig ice: clear cmd_type_offset_bsz for TX rings ice: xsk: fix VSI state check in ice_xsk_wakeup() ...
2022-04-08tcp: Add tracepoint for tcp_set_ca_statePing Gan2-9/+48
The congestion status of a tcp flow may be updated since there is congestion between tcp sender and receiver. It makes sense to add tracepoint for congestion status set function to summate cc status duration and evaluate the performance of network and congestion algorithm. the backgound of this patch is below. Link: https://github.com/iovisor/bcc/pull/3899 Signed-off-by: Ping Gan <jacky_gam_2001@163.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220406010956.19656-1-jacky_gam_2001@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08net-core: rx_otherhost_dropped to core_statsJeffrey Ji2-0/+7
Increment rx_otherhost_dropped counter when packet dropped due to mismatched dest MAC addr. An example when this drop can occur is when manually crafting raw packets that will be consumed by a user space application via a tap device. For testing purposes local traffic was generated using trafgen for the client and netcat to start a server Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other with "{eth(daddr=$INCORRECT_MAC...}", verified that iproute2 showed the counter was incremented. (Also had to modify iproute2 to show the stat, additional patch for that coming next.) Signed-off-by: Jeffrey Ji <jeffreyji@google.com> Reviewed-by: Brian Vazquez <brianvv@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220406172600.1141083-1-jeffreyjilinux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08net: extract a few internals from netdevice.hJakub Kicinski1-70/+2
There's a number of functions and static variables used under net/core/ but not from the outside. We currently dump most of them into netdevice.h. That bad for many reasons: - netdevice.h is very cluttered, hard to figure out what the APIs are; - netdevice.h is very long; - we have to touch netdevice.h more which causes expensive incremental builds. Create a header under net/core/ and move some declarations. The new header is also a bit of a catch-all but that's fine, if we create more specific headers people will likely over-think where their declaration fit best. And end up putting them in netdevice.h, again. More work should be done on splitting netdevice.h into more targeted headers, but that'd be more time consuming so small steps. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-07Merge tag 'hyperv-fixes-signed-20220407' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Correctly propagate coherence information for VMbus devices (Michael Kelley) - Disable balloon and memory hot-add on ARM64 temporarily (Boqun Feng) - Use barrier to prevent reording when reading ring buffer (Michael Kelley) - Use virt_store_mb in favour of smp_store_mb (Andrea Parri) - Fix VMbus device object initialization (Andrea Parri) - Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri) - Fix a crash when unloading VMbus module (Guilherme G. Piccoli) * tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb() Drivers: hv: balloon: Disable balloon and hot-add accordingly Drivers: hv: balloon: Support status report for larger page sizes Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer PCI: hv: Propagate coherence from VMbus device to PCI device Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device Drivers: hv: vmbus: Fix potential crash on module unload Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register() Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
2022-04-07ipv6: fix locking issues with loops over idev->addr_listNiels Dossche1-0/+8
idev->addr_list needs to be protected by idev->lock. However, it is not always possible to do so while iterating and performing actions on inet6_ifaddr instances. For example, multiple functions (like addrconf_{join,leave}_anycast) eventually call down to other functions that acquire the idev->lock. The current code temporarily unlocked the idev->lock during the loops, which can cause race conditions. Moving the locks up is also not an appropriate solution as the ordering of lock acquisition will be inconsistent with for example mc_lock. This solution adds an additional field to inet6_ifaddr that is used to temporarily add the instances to a temporary list while holding idev->lock. The temporary list can then be traversed without holding idev->lock. This change was done in two places. In addrconf_ifdown, the list_for_each_entry_safe variant of the list loop is also no longer necessary as there is no deletion within that specific loop. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220403231523.45843-1-dossche.niels@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-07Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski1-1/+3
Alexei Starovoitov says: ==================== pull-request: bpf 2022-04-06 We've added 8 non-merge commits during the last 8 day(s) which contain a total of 9 files changed, 139 insertions(+), 36 deletions(-). The main changes are: 1) rethook related fixes, from Jiri and Masami. 2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin. 3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets bpf: Support dual-stack sockets in bpf_tcp_check_syncookie bpf: selftests: Test fentry tracing a struct_ops program bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT rethook: Fix to use WRITE_ONCE() for rethook:: Handler selftests/bpf: Fix warning comparing pointer to 0 bpf: Fix sparse warnings in kprobe_multi_resolve_syms bpftool: Explicit errno handling in skeletons ==================== Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-06tcp: add accessors to read/set tp->snd_cwndEric Dumazet2-5/+16
We had various bugs over the years with code breaking the assumption that tp->snd_cwnd is greater than zero. Lately, syzbot reported the WARN_ON_ONCE(!tp->prior_cwnd) added in commit 8b8a321ff72c ("tcp: fix zero cwnd in tcp_cwnd_reduction") can trigger, and without a repro we would have to spend considerable time finding the bug. Instead of complaining too late, we want to catch where and when tp->snd_cwnd is set to an illegal value. Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Link: https://lore.kernel.org/r/20220405233538.947344-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-06net: ethernet: mtk_eth_soc: implement flow offloading to WED devicesFelix Fietkau1-0/+7
This allows hardware flow offloading from Ethernet to WLAN on MT7622 SoC Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)Felix Fietkau1-0/+131
The Wireless Ethernet Dispatch subsystem on the MT7622 SoC can be configured to intercept and handle access to the DMA queues and PCIe interrupts for a MT7615/MT7915 wireless card. It can manage the internal WDMA (Wireless DMA) controller, which allows ethernet packets to be passed from the packet switch engine (PSE) to the wireless card, bypassing the CPU entirely. This can be used to implement hardware flow offloading from ethernet to WLAN. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net, uapi: remove inclusion of arpa/inet.hNick Desaulniers1-16/+12
In include/uapi/linux/tipc_config.h, there's a comment that it includes arpa/inet.h for ntohs; but ntohs is not defined in any UAPI header. For now, reuse the definitions from include/linux/byteorder/generic.h, since the various conversion functions do exist in UAPI headers: include/uapi/linux/byteorder/big_endian.h include/uapi/linux/byteorder/little_endian.h We would like to get to the point where we can build UAPI header tests with -nostdinc, meaning that kernel UAPI headers should not have a circular dependency on libc headers. Link: https://android-review.googlesource.com/c/platform/bionic/+/2048127 Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net: remove noblock parameter from skb_recv_datagram()Oliver Hartkopp1-2/+1
skb_recv_datagram() has two parameters 'flags' and 'noblock' that are merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)' As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags' into 'flags' and 'noblock' with finally obsolete bit operations like this: skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc); And this is not even done consistently with the 'flags' parameter. This patch removes the obsolete and costly splitting into two parameters and only performs bit operations when really needed on the caller side. One missing conversion thankfully reported by kernel test robot. I missed to enable kunit tests to build the mctp code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net: ensure net_todo_list is processed quicklyJohannes Berg1-1/+2
In [1], Will raised a potential issue that the cfg80211 code, which does (from a locking perspective) rtnl_lock() wiphy_lock() rtnl_unlock() might be suspectible to ABBA deadlocks, because rtnl_unlock() calls netdev_run_todo(), which might end up calling rtnl_lock() again, which could then deadlock (see the comment in the code added here for the scenario). Some back and forth and thinking ensued, but clearly this can't happen if the net_todo_list is empty at the rtnl_unlock() here. Clearly, the code here cannot actually put an entry on it, and all other users of rtnl_unlock() will empty it since that will always go through netdev_run_todo(), emptying the list. So the only other way to get there would be to add to the list and then unlock the RTNL without going through rtnl_unlock(), which is only possible through __rtnl_unlock(). However, this isn't exported and not used in many places, and none of them seem to be able to unregister before using it. Therefore, add a WARN_ON() in the code to ensure this invariant won't be broken, so that the cfg80211 (or any similar) code stays safe. [1] https://lore.kernel.org/r/Yjzpo3TfZxtKPMAG@google.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20220404113847.0ee02e4a70da.Ic73d206e217db20fd22dcec14fe5442ca732804b@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-05Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-6/+0
Pull virtio fixes from Michael Tsirkin: "Fixes and cleanups: - A couple of mlx5 fixes related to cvq - A couple of reverts dropping useless code (code that used it got reverted earlier)" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa: mlx5: synchronize driver status with CVQ vdpa: mlx5: prevent cvq work from hogging CPU Revert "virtio_config: introduce a new .enable_cbs method" Revert "virtio: use virtio_device_ready() in virtio_device_restore()"
2022-04-03Merge tag 'trace-v5.18-2' of ↵Linus Torvalds11-68/+29
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing updates from Steven Rostedt: - Rename the staging files to give them some meaning. Just stage1,stag2,etc, does not show what they are for - Check for NULL from allocation in bootconfig - Hold event mutex for dyn_event call in user events - Mark user events to broken (to work on the API) - Remove eBPF updates from user events - Remove user events from uapi header to keep it from being installed. - Move ftrace_graph_is_dead() into inline as it is called from hot paths and also convert it into a static branch. * tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Move user_events.h temporarily out of include/uapi ftrace: Make ftrace_graph_is_dead() a static branch tracing: Set user_events to BROKEN tracing/user_events: Remove eBPF interfaces tracing/user_events: Hold event_mutex during dyn_event_add proc: bootconfig: Add null pointer check tracing: Rename the staging files for trace_events
2022-04-03Merge tag 'core-urgent-2022-04-03' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RT signal fix from Thomas Gleixner: "Revert the RT related signal changes. They need to be reworked and generalized" * tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"
2022-04-03Merge tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds2-131/+1
Pull more dma-mapping updates from Christoph Hellwig: - fix a regression in dma remap handling vs AMD memory encryption (me) - finally kill off the legacy PCI DMA API (Christophe JAILLET) * tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: move pgprot_decrypted out of dma_pgprot PCI/doc: cleanup references to the legacy PCI DMA API PCI: Remove the deprecated "pci-dma-compat.h" API
2022-04-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-23/+48
Pull kvm fixes from Paolo Bonzini: - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr - Documentation improvements - Prevent module exit until all VMs are freed - PMU Virtualization fixes - Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences - Other miscellaneous bugfixes * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits) KVM: x86: fix sending PV IPI KVM: x86/mmu: do compare-and-exchange of gPTE via the user address KVM: x86: Remove redundant vm_entry_controls_clearbit() call KVM: x86: cleanup enter_rmode() KVM: x86: SVM: fix tsc scaling when the host doesn't support it kvm: x86: SVM: remove unused defines KVM: x86: SVM: move tsc ratio definitions to svm.h KVM: x86: SVM: fix avic spec based definitions again KVM: MIPS: remove reference to trap&emulate virtualization KVM: x86: document limitations of MSR filtering KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr KVM: x86/emulator: Emulate RDPID only if it is enabled in guest KVM: x86/pmu: Fix and isolate TSX-specific performance event logic KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs KVM: x86: Trace all APICv inhibit changes and capture overall status KVM: x86: Add wrappers for setting/clearing APICv inhibits KVM: x86: Make APICv inhibit reasons an enum and cleanup naming KVM: X86: Handle implicit supervisor access with SMAP KVM: X86: Rename variable smap to not_smap in permission_fault() ...
2022-04-02tracing: mark user_events as BROKENSteven Rostedt (Google)1-0/+0
After being merged, user_events become more visible to a wider audience that have concerns with the current API. It is too late to fix this for this release, but instead of a full revert, just mark it as BROKEN (which prevents it from being selected in make config). Then we can work finding a better API. If that fails, then it will need to be completely reverted. To not have the code silently bitrot, still allow building it with COMPILE_TEST. And to prevent the uapi header from being installed, then later changed, and then have an old distro user space see the old version, move the header file out of the uapi directory. Surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-02tracing: Move user_events.h temporarily out of include/uapiSteven Rostedt (Google)1-0/+0
While user_events API is under development and has been marked for broken to not let the API become fixed, move the header file out of the uapi directory. This is to prevent it from being installed, then later changed, and then have an old distro user space update with a new kernel, where applications see the user_events being available, but the old header is in place, and then they get compiled incorrectly. Also, surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com Link: https://lkml.kernel.org/r/20220401143903.188384f3@gandalf.local.home Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-04-02ftrace: Make ftrace_graph_is_dead() a static branchChristophe Leroy1-1/+15
ftrace_graph_is_dead() is used on hot paths, it just reads a variable in memory and is not worth suffering function call constraints. For instance, at entry of prepare_ftrace_return(), inlining it avoids saving prepare_ftrace_return() parameters to stack and restoring them after calling ftrace_graph_is_dead(). While at it using a static branch is even more performant and is rather well adapted considering that the returned value will almost never change. Inline ftrace_graph_is_dead() and replace 'kill_ftrace_graph' bool by a static branch. The performance improvement is noticeable. Link: https://lkml.kernel.org/r/e0411a6a0ed3eafff0ad2bc9cd4b0e202b4617df.1648623570.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-04-02tracing/user_events: Remove eBPF interfacesBeau Belgrave1-53/+0
Remove eBPF interfaces within user_events to ensure they are fully reviewed. Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/ Link: https://lkml.kernel.org/r/20220329173051.10087-1-beaub@linux.microsoft.com Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-04-02tracing: Rename the staging files for trace_eventsSteven Rostedt (Google)9-14/+14
When looking for implementation of different phases of the creation of the TRACE_EVENT() macro, it is pretty useless when all helper macro redefinitions are in files labeled "stageX_defines.h". Rename them to state which phase the files are for. For instance, when looking for the defines that are used to create the event fields, seeing "stage4_event_fields.h" gives the developer a good idea that the defines are in that file. Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-04-02KVM: Remove dirty handling from gfn_to_pfn_cache completelyDavid Woodhouse2-10/+5
It isn't OK to cache the dirty status of a page in internal structures for an indefinite period of time. Any time a vCPU exits the run loop to userspace might be its last; the VMM might do its final check of the dirty log, flush the last remaining dirty pages to the destination and complete a live migration. If we have internal 'dirty' state which doesn't get flushed until the vCPU is finally destroyed on the source after migration is complete, then we have lost data because that will escape the final copy. This problem already exists with the use of kvm_vcpu_unmap() to mark pages dirty in e.g. VMX nesting. Note that the actual Linux MM already considers the page to be dirty since we have a writeable mapping of it. This is just about the KVM dirty logging. For the nesting-style use cases (KVM_GUEST_USES_PFN) we will need to track which gfn_to_pfn_caches have been used and explicitly mark the corresponding pages dirty before returning to userspace. But we would have needed external tracking of that anyway, rather than walking the full list of GPCs to find those belonging to this vCPU which are dirty. So let's rely *solely* on that external tracking, and keep it simple rather than laying a tempting trap for callers to fall into. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220303154127.202856-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: Use enum to track if cached PFN will be used in guest and/or hostSean Christopherson2-10/+16
Replace the guest_uses_pa and kernel_map booleans in the PFN cache code with a unified enum/bitmask. Using explicit names makes it easier to review and audit call sites. Opportunistically add a WARN to prevent passing garbage; instantating a cache without declaring its usage is either buggy or pointless. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220303154127.202856-2-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02KVM: Don't actually set a request when evicting vCPUs for GFN cache invdSean Christopherson1-7/+31
Don't actually set a request bit in vcpu->requests when making a request purely to force a vCPU to exit the guest. Logging a request but not actually consuming it would cause the vCPU to get stuck in an infinite loop during KVM_RUN because KVM would see the pending request and bail from VM-Enter to service the request. Note, it's currently impossible for KVM to set KVM_REQ_GPC_INVALIDATE as nothing in KVM is wired up to set guest_uses_pa=true. But, it'd be all too easy for arch code to introduce use of kvm_gfn_to_pfn_cache_init() without implementing handling of the request, especially since getting test coverage of MMU notifier interaction with specific KVM features usually requires a directed test. Opportunistically rename gfn_to_pfn_cache_invalidate_start()'s wake_vcpus to evict_vcpus. The purpose of the request is to get vCPUs out of guest mode, it's supposed to _avoid_ waking vCPUs that are blocking. Opportunistically rename KVM_REQ_GPC_INVALIDATE to be more specific as to what it wants to accomplish, and to genericize the name so that it can used for similar but unrelated scenarios, should they arise in the future. Add a comment and documentation to explain why the "no action" request exists. Add compile-time assertions to help detect improper usage. Use the inner assertless helper in the one s390 path that makes requests without a hardcoded request. Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220223165302.3205276-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02Merge branch 'work.misc' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted bits and pieces" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: drop needless assignment in aio_read() clean overflow checks in count_mounts() a bit seq_file: fix NULL pointer arithmetic warning uml/x86: use x86 load_unaligned_zeropad() asm/user.h: killed unused macros constify struct path argument of finish_automount()/do_add_mount() fs: Remove FIXME comment in generic_write_checks()
2022-04-02Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-2/+3
Pull block driver fixes from Jens Axboe: "Followup block driver updates and fixes for the 5.18-rc1 merge window. In detail: - NVMe pull request - Fix multipath hang when disk goes live over reconnect (Anton Eidelman) - fix RCU hole that allowed for endless looping in multipath round robin (Chris Leech) - remove redundant assignment after left shift (Colin Ian King) - add quirks for Samsung X5 SSDs (Monish Kumar R) - fix the read-only state for zoned namespaces with unsupposed features (Pankaj Raghav) - use a private workqueue instead of the system workqueue in nvmet (Sagi Grimberg) - allow duplicate NSIDs for private namespaces (Sungup Moon) - expose use_threaded_interrupts read-only in sysfs (Xin Hao)" - nbd minor allocation fix (Zhang) - drbd fixes and maintainer addition (Lars, Jakob, Christoph) - n64cart build fix (Jackie) - loop compat ioctl fix (Carlos) - misc fixes (Colin, Dongli)" * tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block: drbd: remove check of list iterator against head past the loop body drbd: remove usage of list iterator variable after loop nbd: fix possible overflow on 'first_minor' in nbd_dev_add() MAINTAINERS: add drbd co-maintainer drbd: fix potential silent data corruption loop: fix ioctl calls using compat_loop_info nvme-multipath: fix hang when disk goes live over reconnect nvme: fix RCU hole that allowed for endless looping in multipath round robin nvme: allow duplicate NSIDs for private namespaces nvmet: remove redundant assignment after left shift nvmet: use a private workqueue instead of the system workqueue nvme-pci: add quirks for Samsung X5 SSDs nvme-pci: expose use_threaded_interrupts read-only in sysfs nvme: fix the read-only state for zoned namespaces with unsupposed features n64cart: convert bi_disk to bi_bdev->bd_disk fix build xen/blkfront: fix comment for need_copy xen-blkback: remove redundant assignment to variable i
2022-04-02Merge tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-2/+5
Pull block fixes from Jens Axboe: "Either fixes or a few additions that got missed in the initial merge window pull. In detail: - List iterator fix to avoid leaking value post loop (Jakob) - One-off fix in minor count (Christophe) - Fix for a regression in how io priority setting works for an exiting task (Jiri) - Fix a regression in this merge window with blkg_free() being called in an inappropriate context (Ming) - Misc fixes (Ming, Tom)" * tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-block: blk-wbt: remove wbt_track stub block: use dedicated list iterator variable block: Fix the maximum minor value is blk_alloc_ext_minor() block: restore the old set_task_ioprio() behaviour wrt PF_EXITING block: avoid calling blkg_free() in atomic context lib/sbitmap: allocate sb->map via kvzalloc_node
2022-04-02Merge tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+0
Pull io_uring fixes from Jens Axboe: "A little bit all over the map, some regression fixes for this merge window, and some general fixes that are stable bound. In detail: - Fix an SQPOLL memory ordering issue (Almog) - Accept fixes (Dylan) - Poll fixes (me) - Fixes for provided buffers and recycling (me) - Tweak to IORING_OP_MSG_RING command added in this merge window (me) - Memory leak fix (Pavel) - Misc fixes and tweaks (Pavel, me)" * tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-block: io_uring: defer msg-ring file validity check until command issue io_uring: fail links if msg-ring doesn't succeeed io_uring: fix memory leak of uid in files registration io_uring: fix put_kbuf without proper locking io_uring: fix invalid flags for io_put_kbuf() io_uring: improve req fields comments io_uring: enable EPOLLEXCLUSIVE for accept poll io_uring: improve task work cache utilization io_uring: fix async accept on O_NONBLOCK sockets io_uring: remove IORING_CQE_F_MSG io_uring: add flag for disabling provided buffer recycling io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly io_uring: don't recycle provided buffer if punted to async worker io_uring: fix assuming triggered poll waitqueue is the single poll io_uring: bump poll refs to full 31-bits io_uring: remove poll entry from list when canceling all io_uring: fix memory ordering when SQPOLL thread goes to sleep io_uring: ensure that fsnotify is always called io_uring: recycle provided before arming poll
2022-04-02Merge tag 'for-5.18/dm-fixes' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM integrity shrink crash due to journal entry not being marked unused. - Fix DM bio polling to handle possibility that underlying device(s) return BLK_STS_AGAIN during submission. - Fix dm_io and dm_target_io flags race condition on Alpha. - Add some pr_err debugging to help debug cases when DM ioctl structure is corrupted. * tag 'for-5.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: fix bio polling to handle possibile BLK_STS_AGAIN dm: fix dm_io and dm_target_io flags race condition on Alpha dm integrity: set journal entry unused when shrinking device dm ioctl: log an error if the ioctl structure is corrupted
2022-04-01Merge tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds4-33/+21
Pull more filesystem folio updates from Matthew Wilcox: "A mixture of odd changes that didn't quite make it into the original pull and fixes for things that did. Also the readpages changes had to wait for the NFS tree to be pulled first. - Remove ->readpages infrastructure - Remove AOP_FLAG_CONT_EXPAND - Move read_descriptor_t to networking code - Pass the iocb to generic_perform_write - Minor updates to iomap, btrfs, ext4, f2fs, ntfs" * tag 'folio-5.18d' of git://git.infradead.org/users/willy/pagecache: btrfs: Remove a use of PAGE_SIZE in btrfs_invalidate_folio() ntfs: Correct mark_ntfs_record_dirty() folio conversion f2fs: Get the superblock from the mapping instead of the page f2fs: Correct f2fs_dirty_data_folio() conversion ext4: Correct ext4_journalled_dirty_folio() conversion filemap: Remove AOP_FLAG_CONT_EXPAND fs: Pass an iocb to generic_perform_write() fs, net: Move read_descriptor_t to net.h fs: Remove read_actor_t iomap: Simplify is_partially_uptodate a little readahead: Update comments mm: remove the skip_page argument to read_pages mm: remove the pages argument to read_pages fs: Remove ->readpages address space operation readahead: Remove read_cache_pages()
2022-04-01Merge tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarrayLinus Torvalds1-0/+1
Pull XArray updates from Matthew Wilcox: - Documentation update - Fix test-suite build after move of bitmap.h - Fix xas_create_range() when a large entry is already present - Fix xas_split() of a shadow entry * tag 'xarray-5.18' of git://git.infradead.org/users/willy/xarray: XArray: Update the LRU list in xas_split() XArray: Fix xas_create_range() when multi-order entry present XArray: Include bitmap.h from xarray.h XArray: Document the locking requirement for the xa_state
2022-04-01Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-3/+1
Merge still more updates from Andrew Morton: "16 patches. Subsystems affected by this patch series: ofs2, nilfs2, mailmap, and mm (madvise, mlock, mfence, memory-failure, kasan, debug, kmemleak, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/damon: prevent activated scheme from sleeping by deactivated schemes mm/kmemleak: reset tag when compare object pointer doc/vm/page_owner.rst: remove content related to -c option tools/vm/page_owner_sort.c: remove -c option mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEP mm,hwpoison: unmap poisoned page before invalidation mailmap: update Kirill's email mm: kfence: fix objcgs vector allocation mm/munlock: protect the per-CPU pagevec by a local_lock_t mm/munlock: update Documentation/vm/unevictable-lru.rst mm/munlock: add lru_add_drain() to fix memcg_stat_test nilfs2: get rid of nilfs_mapping_init() nilfs2: fix lockdep warnings during disk space reclamation nilfs2: fix lockdep warnings in page operations for btree nodes ocfs2: fix crash when mount with quota enabled Revert "mm: madvise: skip unmapped vma holes passed to process_madvise"
2022-04-01mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEPAndrey Konovalov1-3/+1
KASAN changes that added new GFP flags mistakenly updated __GFP_BITS_SHIFT as the total number of GFP bits instead of as a shift used to define __GFP_BITS_MASK. This broke LOCKDEP, as __GFP_BITS_MASK now gets the 25th bit enabled instead of the 28th for __GFP_NOLOCKDEP. Update __GFP_BITS_SHIFT to always count KASAN GFP bits. In the future, we could handle all combinations of KASAN and LOCKDEP to occupy as few bits as possible. For now, we have enough GFP bits to be inefficient in this quick fix. Link: https://lkml.kernel.org/r/462ff52742a1fcc95a69778685737f723ee4dfb3.1648400273.git.andreyknvl@google.com Fixes: 9353ffa6e9e9 ("kasan, page_alloc: allow skipping memory init for HW_TAGS") Fixes: 53ae233c30a6 ("kasan, page_alloc: allow skipping unpoisoning for HW_TAGS") Fixes: f49d9c5bb15c ("kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01filemap: Remove AOP_FLAG_CONT_EXPANDMatthew Wilcox (Oracle)1-1/+0
This flag is no longer used, so remove it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01fs: Pass an iocb to generic_perform_write()Matthew Wilcox (Oracle)1-1/+1
We can extract both the file pointer and the pos from the iocb. This simplifies each caller as well as allowing generic_perform_write() to see more of the iocb contents in the future. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01fs, net: Move read_descriptor_t to net.hMatthew Wilcox (Oracle)2-19/+19
fs.h has no more need for this typedef; networking is now the sole user of the read_descriptor_t. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01fs: Remove read_actor_tMatthew Wilcox (Oracle)1-3/+0
This typedef is not used any more. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01fs: Remove ->readpages address space operationMatthew Wilcox (Oracle)2-7/+1
All filesystems have now been converted to use ->readahead, so remove the ->readpages operation and fix all the comments that used to refer to it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01readahead: Remove read_cache_pages()Matthew Wilcox (Oracle)1-2/+0
With no remaining users, remove this function and the related infrastructure. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-01Merge tag 'sound-fix-5.18-rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Just a few fixes that have been gathered since the previous pull: - An additional fix for potential PCM deadlocks - A series of HD-audio CS8409 codec patches for new models - Other device specific fixes for HD-audio, ASoC mediatek, Intel, fsl, rockchip" * tag 'sound-fix-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lock ALSA: hda: Avoid unsol event during RPM suspending ALSA: hda/realtek: Fix audio regression on Mi Notebook Pro 2020 ALSA: hda/cs8409: Add new Dolphin HW variants ALSA: hda/cs8409: Disable HSBIAS_SENSE_EN for Cyborg ALSA: hda/cs8409: Support new Warlock MLK Variants ALSA: hda/cs8409: Fix Full Scale Volume setting for all variants ALSA: hda/cs8409: Re-order quirk table into ascending order ALSA: hda/cs8409: Fix Warlock to use mono mic configuration ALSA: cs4236: fix an incorrect NULL check on list iterator ALSA: hda/realtek: Enable headset mic on Lenovo P360 ASoC: SOF: Intel: Fix build error without SND_SOC_SOF_PCI_DEV ALSA: hda/realtek: Add mute and micmut LED support for Zbook Fury 17 G9 ASoC: rockchip: i2s_tdm: Fixup config for SND_SOC_DAIFMT_DSP_A/B ASoC: fsl-asoc-card: Fix jack_event() always return 0 ASoC: mediatek: mt6358: add missing EXPORT_SYMBOLs
2022-04-01Merge tag 'gpio-fixes-for-v5.18-rc1' of ↵Linus Torvalds1-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - grammar and formatting fixes in comments for gpio-ts4900 - correct links in gpio-ts5500 - fix a warning in doc generation for the core GPIO documentation * tag 'gpio-fixes-for-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: ts5500: Fix Links to Technologic Systems web resources gpio: Properly document parent data union gpio: ts4900: Fix comment formatting and grammar
2022-04-01dm: fix dm_io and dm_target_io flags race condition on AlphaMikulas Patocka1-0/+2
Early alpha processors cannot write a single byte or short; they read 8 bytes, modify the value in registers and write back 8 bytes. This could cause race condition in the structure dm_io - if the fields flags and io_count are modified simultaneously. Fix this bug by using 32-bit flags if we are on Alpha and if we are compiling for a processor that doesn't have the byte-word-extension. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: bd4a6dd241ae ("dm: reduce size of dm_io and dm_target_io structs") [snitzer: Jens allowed this change since Mikulas owns a relevant Alpha!] Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-04-01Merge branch 'for-linus' of ↵Linus Torvalds2-0/+29
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - a revert of a patch resetting extra buttons on touchpads claiming to be buttonpads as this caused regression on certain Dell devices - a new driver for Mediatek MT6779 keypad - a new driver for Imagis touchscreen - rework of Google/Chrome OS "Vivaldi" keyboard handling - assorted driver fixes. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (31 commits) Revert "Input: clear BTN_RIGHT/MIDDLE on buttonpads" Input: adi - remove redundant variable z Input: add Imagis touchscreen driver dt-bindings: input/touchscreen: bindings for Imagis Input: synaptics - enable InterTouch on ThinkPad T14/P14s Gen 1 AMD Input: stmfts - fix reference leak in stmfts_input_open Input: add bounds checking to input_set_capability() Input: iqs5xx - use local input_dev pointer HID: google: modify HID device groups of eel HID: google: Add support for vivaldi to hid-hammer HID: google: extract Vivaldi hid feature mapping for use in hid-hammer Input: extract ChromeOS vivaldi physmap show function HID: google: switch to devm when registering keyboard backlight LED Input: mt6779-keypad - fix signedness bug Input: mt6779-keypad - add MediaTek keypad driver dt-bindings: input: Add bindings for Mediatek matrix keypad Input: da9063 - use devm_delayed_work_autocancel() Input: goodix - fix race on driver unbind Input: goodix - use input_copy_abs() helper Input: add input_copy_abs() function ...
2022-04-01Merge tag 'rtc-5.18' of ↵Linus Torvalds6-4/+16
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "The bulk of the patches are about replacing the uie_unsupported struct rtc_device member by a feature bit. Subsystem: - remove uie_unsupported, all users have been converted to clear RTC_FEATURE_UPDATE_INTERRUPT and provide a reason - RTCs with an alarm with a resolution of a minute are now letting the core handle rounding down the alarm time - fix use-after-free on device removal New driver: - OP-TEE RTC PTA Drivers: - sun6i: Add H616 support - cmos: Fix the AltCentury for AMD platforms - spear: set range" * tag 'rtc-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (56 commits) rtc: check if __rtc_read_time was successful rtc: gamecube: Fix refcount leak in gamecube_rtc_read_offset_from_sram rtc: mc146818-lib: Fix the AltCentury for AMD platforms rtc: optee: add RTC driver for OP-TEE RTC PTA rtc: pm8xxx: Return -ENODEV if set_time disallowed rtc: pm8xxx: Attach wake irq to device clk: sunxi-ng: sun6i-rtc: include clk/sunxi-ng.h rtc: remove uie_unsupported rtc: xgene: stop using uie_unsupported rtc: hym8563: switch to RTC_FEATURE_UPDATE_INTERRUPT rtc: hym8563: let the core handle the alarm resolution rtc: hym8563: switch to devm_rtc_allocate_device rtc: efi: switch to RTC_FEATURE_UPDATE_INTERRUPT rtc: efi: switch to devm_rtc_allocate_device rtc: add new RTC_FEATURE_ALARM_WAKEUP_ONLY feature rtc: spear: fix spear_rtc_read_time rtc: spear: drop uie_unsupported rtc: spear: set range rtc: spear: switch to devm_rtc_allocate_device rtc: pcf8563: switch to RTC_FEATURE_UPDATE_INTERRUPT ...
2022-04-01mctp: Use output netdev to allocate skb headroomMatt Johnston1-2/+0
Previously the skb was allocated with headroom MCTP_HEADER_MAXLEN, but that isn't sufficient if we are using devs that are not MCTP specific. This also adds a check that the smctp_halen provided to sendmsg for extended addressing is the correct size for the netdev. Fixes: 833ef3b91de6 ("mctp: Populate socket implementation") Reported-by: Matthew Rinaldi <mjrinal@g.clemson.edu> Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01Merge tag 'netfs-prep-20220318' of ↵Linus Torvalds4-106/+266
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfs updates from David Howells: "Netfs prep for write helpers. Having had a go at implementing write helpers and content encryption support in netfslib, it seems that the netfs_read_{,sub}request structs and the equivalent write request structs were almost the same and so should be merged, thereby requiring only one set of alloc/get/put functions and a common set of tracepoints. Merging the structs also has the advantage that if a bounce buffer is added to the request struct, a read operation can be performed to fill the bounce buffer, the contents of the buffer can be modified and then a write operation can be performed on it to send the data wherever it needs to go using the same request structure all the way through. The I/O handlers would then transparently perform any required crypto. This should make it easier to perform RMW cycles if needed. The potentially common functions and structs, however, by their names all proclaim themselves to be associated with the read side of things. The bulk of these changes alter this in the following ways: - Rename struct netfs_read_{,sub}request to netfs_io_{,sub}request. - Rename some enums, members and flags to make them more appropriate. - Adjust some comments to match. - Drop "read"/"rreq" from the names of common functions. For instance, netfs_get_read_request() becomes netfs_get_request(). - The ->init_rreq() and ->issue_op() methods become ->init_request() and ->issue_read(). I've kept the latter as a read-specific function and in another branch added an ->issue_write() method. The driver source is then reorganised into a number of files: fs/netfs/buffered_read.c Create read reqs to the pagecache fs/netfs/io.c Dispatchers for read and write reqs fs/netfs/main.c Some general miscellaneous bits fs/netfs/objects.c Alloc, get and put functions fs/netfs/stats.c Optional procfs statistics. and future development can be fitted into this scheme, e.g.: fs/netfs/buffered_write.c Modify the pagecache fs/netfs/buffered_flush.c Writeback from the pagecache fs/netfs/direct_read.c DIO read support fs/netfs/direct_write.c DIO write support fs/netfs/unbuffered_write.c Write modifications directly back Beyond the above changes, there are also some changes that affect how things work: - Make fscache_end_operation() generally available. - In the netfs tracing header, generate enums from the symbol -> string mapping tables rather than manually coding them. - Add a struct for filesystems that uses netfslib to put into their inode wrapper structs to hold extra state that netfslib is interested in, such as the fscache cookie. This allows netfslib functions to be set in filesystem operation tables and jumped to directly without having to have a filesystem wrapper. - Add a member to the struct added above to track the remote inode length as that may differ if local modifications are buffered. We may need to supply an appropriate EOF pointer when storing data (in AFS for example). - Pass extra information to netfs_alloc_request() so that the ->init_request() hook can access it and retain information to indicate the origin of the operation. - Make the ->init_request() hook return an error, thereby allowing a filesystem that isn't allowed to cache an inode (ceph or cifs, for example) to skip readahead. - Switch to using refcount_t for subrequests and add tracepoints to log refcount changes for the request and subrequest structs. - Add a function to consolidate dispatching a read request. Similar code is used in three places and another couple are likely to be added in the future" Link: https://lore.kernel.org/all/2639515.1648483225@warthog.procyon.org.uk/ * tag 'netfs-prep-20220318' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Maintain netfs_i_context::remote_i_size netfs: Keep track of the actual remote file size netfs: Split some core bits out into their own file netfs: Split fs/netfs/read_helper.c netfs: Rename read_helper.c to io.c netfs: Prepare to split read_helper.c netfs: Add a function to consolidate beginning a read netfs: Add a netfs inode context ceph: Make ceph_init_request() check caps on readahead netfs: Change ->init_request() to return an error code netfs: Refactor arguments for netfs_alloc_read_request netfs: Adjust the netfs_failure tracepoint to indicate non-subreq lines netfs: Trace refcounting on the netfs_io_subrequest struct netfs: Trace refcounting on the netfs_io_request struct netfs: Adjust the netfs_rreq tracepoint slightly netfs: Split netfs_io_* object handling out netfs: Finish off rename of netfs_read_request to netfs_io_request netfs: Rename netfs_read_*request to netfs_io_*request netfs: Generate enums from trace symbol mapping lists fscache: export fscache_end_operation()
2022-03-31Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds5-27/+99
Pull virtio updates from Michael Tsirkin: - vdpa generic device type support - more virtio hardening for broken devices (but on the same theme, revert some virtio hotplug hardening patches - they were misusing some interrupt flags and had to be reverted) - RSS support in virtio-net - max device MTU support in mlx5 vdpa - akcipher support in virtio-crypto - shared IRQ support in ifcvf vdpa - a minor performance improvement in vhost - enable virtio mem for ARM64 - beginnings of advance dma support - cleanups, fixes all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (33 commits) vdpa/mlx5: Avoid processing works if workqueue was destroyed vhost: handle error while adding split ranges to iotlb vdpa: support exposing the count of vqs to userspace vdpa: change the type of nvqs to u32 vdpa: support exposing the config size to userspace vdpa/mlx5: re-create forwarding rules after mac modified virtio: pci: check bar values read from virtio config space Revert "virtio_pci: harden MSI-X interrupts" Revert "virtio-pci: harden INTX interrupts" drivers/net/virtio_net: Added RSS hash report control. drivers/net/virtio_net: Added RSS hash report. drivers/net/virtio_net: Added basic RSS support. drivers/net/virtio_net: Fixed padded vheader to use v1 with hash. virtio: use virtio_device_ready() in virtio_device_restore() tools/virtio: compile with -pthread tools/virtio: fix after premapped buf support virtio_ring: remove flags check for unmap packed indirect desc virtio_ring: remove flags check for unmap split indirect desc virtio_ring: rename vring_unmap_state_packed() to vring_unmap_extra_packed() net/mlx5: Add support for configuring max device MTU ...