summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-06-12netlink: specs: support setting prefix-name per attributeJakub Kicinski3-2/+13
Ethtool's PSE PoDL has a attr nest with different prefixes: /* Power Sourcing Equipment */ enum { ETHTOOL_A_PSE_UNSPEC, ETHTOOL_A_PSE_HEADER, /* nest - _A_HEADER_* */ ETHTOOL_A_PODL_PSE_ADMIN_STATE, /* u32 */ ETHTOOL_A_PODL_PSE_ADMIN_CONTROL, /* u32 */ ETHTOOL_A_PODL_PSE_PW_D_STATUS, /* u32 */ Header has a prefix of ETHTOOL_A_PSE_ and other attrs prefix of ETHTOOL_A_PODL_PSE_ we can't cover them uniformly. If PODL was after PSE life would be easy. Now we either need to add prefixes to attr names which is yucky or support setting prefix name per attr. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12tools: ynl-gen: record extra args for regenJakub Kicinski2-1/+8
ynl-regen needs to know the arguments used to generate a file. Record excluded ops and, while at it, user headers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12tools: ynl-gen: support excluding tricky opsJakub Kicinski2-5/+17
The ethtool family has a small handful of quite tricky ops and a lot of simple very useful ops. Teach ynl-gen to skip ops so that we can bypass the tricky ones. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mdio: mdio-mux-mmioreg: Use of_property_read_reg() to parse "reg"Rob Herring1-3/+4
Use the recently added of_property_read_reg() helper to get the untranslated "reg" address value. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12dt-bindings: net: drop unneeded quotesKrzysztof Kozlowski10-11/+11
Cleanup bindings dropping unneeded quotes. Once all these are fixed, checking for this can be enabled in yamllint. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12Merge branch 'SCM_PIDFD-SCM_PEERPIDFD'David S. Miller16-13/+565
Alexander Mikhalitsyn says: ==================== Add SCM_PIDFD and SO_PEERPIDFD 1. Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS, but it contains pidfd instead of plain pid, which allows programmers not to care about PID reuse problem. 2. Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd. This thing is direct analog of SO_PEERCRED which allows to get plain PID. 3. Add SCM_PIDFD / SO_PEERPIDFD kselftest Idea comes from UAPI kernel group: https://uapi-group.org/kernel-features/ Big thanks to Christian Brauner and Lennart Poettering for productive discussions about this and Luca Boccassi for testing and reviewing this. === Motivation behind this patchset Eric Dumazet raised a question: > It seems that we already can use pidfd_open() (since linux-5.3), and > pass the resulting fd in af_unix SCM_RIGHTS message ? Yes, it's possible, but it means that from the receiver side we need to trust the sent pidfd (in SCM_RIGHTS), or always use combination of SCM_RIGHTS+SCM_CREDENTIALS, then we can extract pidfd from SCM_RIGHTS, then acquire plain pid from pidfd and after compare it with the pid from SCM_CREDENTIALS. A few comments from other folks regarding this. Christian Brauner wrote: >Let me try and provide some of the missing background. >There are a range of use-cases where we would like to authenticate a >client through sockets without being susceptible to PID recycling >attacks. Currently, we can't do this as the race isn't fully fixable. >We can only apply mitigations. >What this patchset will allows us to do is to get a pidfd without the >client having to send us an fd explicitly via SCM_RIGHTS. As that's >already possibly as you correctly point out. >But for protocols like polkit this is quite important. Every message is >standalone and we would need to force a complete protocol change where >we would need to require that every client allocate and send a pidfd via >SCM_RIGHTS. That would also mean patching through all polkit users. >For something like systemd-journald where we provide logging facilities >and want to add metadata to the log we would also immensely benefit from >being able to get a receiver-side controlled pidfd. >With the message type we envisioned we don't need to change the sender >at all and can be safe against pid recycling. >Link: https://gitlab.freedesktop.org/polkit/polkit/-/merge_requests/154 >Link: https://uapi-group.org/kernel-features Lennart Poettering wrote: >So yes, this is of course possible, but it would mean the pidfd would >have to be transported as part of the user protocol, explicitly sent >by the sender. (Moreover, the receiver after receiving the pidfd would >then still have to somehow be able to prove that the pidfd it just >received actually refers to the peer's process and not some random >process. – this part is actually solvable in userspace, but ugly) >The big thing is simply that we want that the pidfd is associated >*implicity* with each AF_UNIX connection, not explicitly. A lot of >userspace already relies on this, both in the authentication area >(polkit) as well as in the logging area (systemd-journald). Right now >using the PID field from SO_PEERCREDS/SCM_CREDENTIALS is racy though >and very hard to get right. Making this available as pidfd too, would >solve this raciness, without otherwise changing semantics of it all: >receivers can still enable the creds stuff as they wish, and the data >is then implicitly appended to the connections/datagrams the sender >initiates. >Or to turn this around: things like polkit are typically used to >authenticate arbitrary dbus methods calls: some service implements a >dbus method call, and when an unprivileged client then issues that >call, it will take the client's info, go to polkit and ask it if this >is ok. If we wanted to send the pidfd as part of the protocol we >basically would have to extend every single method call to contain the >client's pidfd along with it as an additional argument, which would be >a massive undertaking: it would change the prototypes of basically >*all* methods a service defines… And that's just ugly. >Note that Alex' patch set doesn't expose anything that wasn't exposed >before, or attach, propagate what wasn't before. All it does, is make >the field already available anyway (the struct ucred .pid field) >available also in a better way (as a pidfd), to solve a variety of >races, with no effect on the protocol actually spoken within the >AF_UNIX transport. It's a seamless improvement of the status quo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12af_unix: Kconfig: make CONFIG_UNIX boolAlexander Mikhalitsyn1-5/+1
Let's make CONFIG_UNIX a bool instead of a tristate. We've decided to do that during discussion about SCM_PIDFD patchset [1]. [1] https://lore.kernel.org/lkml/20230524081933.44dc8bea@kernel.org/ Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: David Ahern <dsahern@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-arch@vger.kernel.org Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12selftests: net: add SCM_PIDFD / SO_PEERPIDFD testAlexander Mikhalitsyn3-1/+433
Basic test to check consistency between: - SCM_CREDENTIALS and SCM_PIDFD - SO_PEERCRED and SO_PEERPIDFD Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: David Ahern <dsahern@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12net: core: add getsockopt SO_PEERPIDFDAlexander Mikhalitsyn8-0/+55
Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd. This thing is direct analog of SO_PEERCRED which allows to get plain PID. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: David Ahern <dsahern@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Stanislav Fomichev <sdf@google.com> Cc: bpf@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-arch@vger.kernel.org Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Tested-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12scm: add SO_PASSPIDFD and SCM_PIDFDAlexander Mikhalitsyn12-7/+76
Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS, but it contains pidfd instead of plain pid, which allows programmers not to care about PID reuse problem. We mask SO_PASSPIDFD feature if CONFIG_UNIX is not builtin because it depends on a pidfd_prepare() API which is not exported to the kernel modules. Idea comes from UAPI kernel group: https://uapi-group.org/kernel-features/ Big thanks to Christian Brauner and Lennart Poettering for productive discussions about this. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: David Ahern <dsahern@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-arch@vger.kernel.org Tested-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12Merge branch 'mlxsw-cleanups'David S. Miller6-81/+89
Petr Machata says: ==================== mlxsw: Cleanups in router code This patchset moves some router-related code from spectrum.c to spectrum_router.c where it should be. It also simplifies handlers of netevent notifications. - Patch #1 caches router pointer in a dedicated variable. This obviates the need to access the same as mlxsw_sp->router, making lines shorter, and permitting a future patch to add code that fits within 80 character limit. - Patch #2 moves IP / IPv6 validation notifier blocks from spectrum.c to spectrum_router, where the handlers are anyway. - In patch #3, pass router pointer to scheduler of deferred work directly, instead of having it deduce it on its own. - This makes the router pointer available in the handler function mlxsw_sp_router_netevent_event(), so in patch #4, use it directly, instead of finding it through mlxsw_sp_port. - In patch #5, extend mlxsw_sp_router_schedule_work() so that the NETEVENT_NEIGH_UPDATE handler can use it directly instead of inlining equivalent code. - In patches #6 and #7, add helpers for two common operations involving a backing netdev of a RIF. This makes it unnecessary for the function mlxsw_sp_rif_dev() to be visible outside of the router module, so in patch #8, hide it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: spectrum_router: Privatize mlxsw_sp_rif_dev()Petr Machata2-2/+1
Now that the external users of mlxsw_sp_rif_dev() have been converted in the preceding patches, make the function static. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: Convert does-RIF-have-this-netdev queries to a dedicated helperPetr Machata3-9/+14
In a number of places, a netdevice underlying a RIF is obtained only to compare it to another pointer. In order to clean up the interface between the router and the other modules, add a new helper to specifically answer this question, and convert the relevant uses to this new interface. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: Convert RIF-has-netdevice queries to a dedicated helperPetr Machata4-4/+10
In a number of places, a netdevice underlying a RIF is obtained only to check if it a NULL pointer. In order to clean up the interface between the router and the other modules, add a new helper to specifically answer this question, and convert the relevant uses to this new interface. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: spectrum_router: Reuse work neighbor initialization in work schedulerPetr Machata1-19/+10
After the struct mlxsw_sp_netevent_work.n field initialization is moved here, the body of code that handles NETEVENT_NEIGH_UPDATE is almost identical to the one in the helper function. Therefore defer to the helper instead of inlining the equivalent. Note that previously, the code took and put a reference of the netdevice. The new code defers to mlxsw_sp_dev_lower_is_port() to obviate the need for taking the reference. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: spectrum_router: Use the available router pointer for netevent handlingPetr Machata1-7/+12
This code handles NETEVENT_DELAY_PROBE_TIME_UPDATE, which is invoked every time the delay_probe_time changes. mlxsw router currently only maintains one timer, so the last delay_probe_time set wins. Currently, mlxsw uses mlxsw_sp_port_lower_dev_hold() to find a reference to the router. This is no longer necessary. But as a side effect, this makes sure that only updates to "interesting netdevices" (ones that have a physical netdevice lower) are projected. Retain that side effect by calling mlxsw_sp_port_dev_lower_find_rcu() and punting if there is none. Then just proceed using the router pointer that's already at hand in the helper. Note that previously, the code took and put a reference of the netdevice. Because the mlxsw_sp pointer is now obtained from the notifier block, the port pointer (non-) NULL-ness is all that's relevant, and the reference does not need to be taken anymore. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: spectrum_router: Pass router to mlxsw_sp_router_schedule_work() directlyPetr Machata1-5/+6
Instead of passing a notifier block and deducing the router pointer from that in the helper, do that in the caller, and pass the result. In the following patches, the pointer will also be made useful in the caller. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: spectrum_router: Move here inetaddr validator notifiersPetr Machata4-25/+25
The validation logic is already in the router code. Move there the notifier blocks themselves as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12mlxsw: spectrum_router: mlxsw_sp_router_fini(): Extract a helper variablePetr Machata1-12/+13
Make mlxsw_sp_router_fini() more similar to the _init() function (and more concise) by extracting the `router' handle to a named variable and using that throughout. The availability of a dedicated `router' variable will come in handy in following patches. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12net: openvswitch: add support for l4 symmetric hashingAaron Conole3-2/+13
Since its introduction, the ovs module execute_hash action allowed hash algorithms other than the skb->l4_hash to be used. However, additional hash algorithms were not implemented. This means flows requiring different hash distributions weren't able to use the kernel datapath. Now, introduce support for symmetric hashing algorithm as an alternative hash supported by the ovs module using the flow dissector. Output of flow using l4_sym hash: recirc_id(0),in_port(3),eth(),eth_type(0x0800), ipv4(dst=64.0.0.0/192.0.0.0,proto=6,frag=no), packets:30473425, bytes:45902883702, used:0.000s, flags:SP., actions:hash(sym_l4(0)),recirc(0xd) Some performance testing with no GRO/GSO, two veths, single flow: hash(l4(0)): 4.35 GBits/s hash(l4_sym(0)): 4.24 GBits/s Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12Merge branch 'taprio-xstats'David S. Miller3-22/+25
Vladimir Oltean says: ==================== Fixes for taprio xstats 1. Taprio classes correspond to TXQs, and thus, class stats are TXQ stats not TC stats. 2. Drivers reporting taprio xstats should report xstats for *this* taprio, not for a previous one. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12net: enetc: reset taprio stats when taprio is deletedVladimir Oltean1-0/+9
Currently, the window_drop stats persist even if an incorrect Qdisc was removed from the interface and a new one is installed. This is because the enetc driver keeps the state, and that is persistent across multiple Qdiscs. To resolve the issue, clear all win_drop counters from all TX queues when the currently active Qdisc is removed. These counters are zero by default. The counters visible in ethtool -S are also affected, but I don't care very much about preserving those enough to keep them monotonically incrementing. Fixes: 4802fca8d1af ("net: enetc: report statistics counters for taprio") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12net/sched: taprio: report class offload stats per TXQ, not per TCVladimir Oltean3-22/+16
The taprio Qdisc creates child classes per netdev TX queue, but taprio_dump_class_stats() currently reports offload statistics per traffic class. Traffic classes are groups of TXQs sharing the same dequeue priority, so this is incorrect and we shouldn't be bundling up the TXQ stats when reporting them, as we currently do in enetc. Modify the API from taprio to drivers such that they report TXQ offload stats and not TC offload stats. There is no change in the UAPI or in the global Qdisc stats. Fixes: 6c1adb650c8d ("net/sched: taprio: add netlink reporting for offload statistics counters") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12nfc: nxp-nci: store __be16 value in __be16 variableSimon Horman1-1/+1
Use a __be16 variable to store the big endian value of header in nxp_nci_i2c_fw_read(). Flagged by Sparse as: .../i2c.c:113:22: warning: cast to restricted __be16 No functional changes intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12net: mana: Add support for vlan taggingHaiyang Zhang1-2/+17
To support vlan, use MANA_LONG_PKT_FMT if vlan tag is present in TX skb. Then extract the vlan tag from the skb struct, and save it to tx_oob for the NIC to transmit. For vlan tags on the payload, they are accepted by the NIC too. For RX, extract the vlan tag from CQE and put it into skb. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12sfc: Add devlink dev info support for EF10Martin Habets1-0/+9
Reuse the work done for EF100 to add devlink support for EF10. There is no devlink port support for EF10. Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12net/sched: act_pedit: Use kmemdup() to replace kmalloc + memcpyJiapeng Chong1-3/+1
./net/sched/act_pedit.c:245:21-28: WARNING opportunity for kmemdup. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5478 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12ionic: add support for ethtool extended stat link_down_countNitya Sunkad3-0/+12
Following the example of 'commit 9a0f830f8026 ("ethtool: linkstate: add a statistic for PHY down events")', added support for link down events. Add callback ionic_get_link_ext_stats to ionic_ethtool.c to support link_down_count, a property of netdev that gets reported exclusively on physical link down events. Run ethtool -I <devname> to display the device link down count. Signed-off-by: Nitya Sunkad <nitya.sunkad@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12Merge branch '100GbE' of ↵David S. Miller4-48/+84
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: Improve miscellaneous interrupt code Jacob Keller says: This series improves the driver's use of the threaded IRQ and the communication between ice_misc_intr() and the ice_misc_intr_thread_fn() which was previously introduced by commit 1229b33973c7 ("ice: Add low latency Tx timestamp read"). First, a new custom enumerated return value is used instead of a boolean for ice_ptp_process_ts(). This significantly reduces the cognitive burden when reviewing the logic for this function, as the expected action is clear from the return value name. Second, the unconditional loop in ice_misc_intr_thread_fn() is removed, replacing it with a write to the Other Interrupt Cause register. This causes the MAC to trigger the Tx timestamp interrupt again. This makes it possible to safely use the ice_misc_intr_thread_fn() to handle other tasks beyond just the Tx timestamps. It is also easier to reason about since the thread function will exit cleanly if we do something like disable the interrupt and call synchronize_irq(). Third, refactor the handling for external timestamp events to use the miscellaneous thread function. This resolves an issue with the external time stamps getting blocked while processing the periodic work function task. Fourth, a simplification of the ice_misc_intr() function to always return IRQ_WAKE_THREAD, and schedule the ice service task in the ice_misc_intr_thread_fn() instead. Finally, the Other Interrupt Cause is kept disabled over the thread function processing, rather than immediately re-enabled. Special thanks to Michal Schmidt for the careful review of the series and pointing out my misunderstandings of the kernel IRQ code. It has been determined that the race outlined as being fixed in previous series was actually introduced by this series itself, which I've since corrected. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12net: wwan: iosm: enable runtime pm support for 7560M Chetan Kumar6-4/+65
Adds runtime pm support for 7560. As part of probe procedure auto suspend is enabled and auto suspend delay is set to 5000 ms for runtime pm use. Later auto flag is set to power manage the device at run time. On successful communication establishment between host and device the device usage counter is dropped and request to put the device into sleep state (suspend). In TX path, the device usage counter is raised and device is moved out of sleep(resume) for data transmission. In RX path, if the device has some data to be sent it request host platform to change the power state by giving PCI PME message. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12dt-bindings: net: xlnx,axi-ethernet: convert bindings document to yamlRadhey Shyam Pandey3-101/+184
Convert the bindings document for Xilinx AXI Ethernet Subsystem from txt to yaml. No changes to existing binding description. Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Signed-off-by: Sarath Babu Naidu Gaddam <sarath.babu.naidu.gaddam@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-11selftests: net: vxlan: Fix selftest regression after changes in iproute2.Vladimir Nikishkin1-3/+3
The iproute2 output that eventually landed upstream is different than the one used in this test, resulting in failures. Fix by adjusting the test to use iproute2's JSON output, which is more stable than regular output. Fixes: 305c04189997 ("selftests: net: vxlan: Add tests for vxlan nolocalbypass option.") Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10Merge branch 'renesas-rswitch-perf'David S. Miller2-23/+22
Yoshihiro Shimoda says: ==================== net: renesas: rswitch: Improve perfromance of TX/RX This patch series is based on net-next.git / main branch [1]. This patch series can improve perfromance of TX in a specific condition. The previous code used "global rate limiter" feature so that this is possible to cause performance down if we use multiple ports at the same time. To resolve this issue, use "hardware pause" features of GWCA and COMA. Note that this is not related to the ethernet PAUSE frames. < UDP TX by iperf3 > before: about 450Mbps on both tsn0 and tsn1 after: about 950Mbps on both tsn0 and tsn1 Also, this patch series can improve performance of RX by using napi_gro_receive(). < TCP RX by iperf > before: about 670Mbps on tsn0 after: about 840Mbps on tsn0 [1] The commit e06bd5e3adae ("Merge branch 'followup-fixes-for-the-dwmac-and-altera-lynx-conversion'") Changes from v3: https://lore.kernel.org/all/20230607015641.1724057-1-yoshihiro.shimoda.uh@renesas.com/ - Rebased on the latest net-next.git / main branch. - Added Reviewed-by in the patch 2/2. (Maciej, thanks!) - Fix typos in the commit description in the patch 2/2. Changes from v2: https://lore.kernel.org/all/20230606085558.1708766-1-yoshihiro.shimoda.uh@renesas.com/ - Rebased on the latest net-next.git / main branch. - Added Reviewed-by in the patch 1/2. (Maciej, thanks!) - Revise the commit description in the patch 2/2. - Add definition to remove magic hardcoded numbers in the patch 2/2. Changes from v1: https://lore.kernel.org/all/20230529080840.1156458-1-yoshihiro.shimoda.uh@renesas.com/ - Rebased on the latest net-next.git / main branch. - Use "hardware pause" feature instead of "per-queue limiter" feature. - Drop refactaring for "per-queue limiter". - Drop dt-bindings update because "hardware pause" doesn't need additional clock information. - Use napi_gro_receive() to improve RX performance. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10net: renesas: rswitch: Use hardware pause featuresYoshihiro Shimoda2-22/+21
Since this driver used the "global rate limiter" feature of GWCA, the TX performance of each port was reduced when multiple ports transmitted frames simultaneously. To improve performance, remove the use of the "global rate limiter" feature and use "hardware pause" features of the following: - "per priority pause" of GWCA - "global pause" of COMA Note that these features are not related to the ethernet PAUSE frame. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10net: renesas: rswitch: Use napi_gro_receive() in RXYoshihiro Shimoda1-1/+1
This hardware can receive multiple frames so that using napi_gro_receive() instead of netif_receive_skb() gets good performance of RX. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10Merge branch 'sfc-tc-encap-actions-offload'Jakub Kicinski11-8/+1222
Edward Cree says: ==================== sfc: TC encap actions offload This series adds support for offloading TC tunnel_key set actions to the EF100 driver, supporting VxLAN and GENEVE tunnels over IPv4 or IPv6. ==================== Link: https://lore.kernel.org/r/cover.1686240142.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10sfc: generate encap headers for TC offloadEdward Cree1-9/+185
Support constructing VxLAN and GENEVE headers, on either IPv4 or IPv6, using the neighbouring information obtained in encap->neigh to populate the Ethernet header. Note that the ef100 hardware does not insert UDP checksums when performing encap, so for IPv6 the remote endpoint will need to be configured with udp6zerocsumrx or equivalent. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10sfc: neighbour lookup for TC encap action offloadEdward Cree8-6/+569
For each neighbour we're interested in, create a struct efx_neigh_binder object which has a list of all the encap_actions using it. When we receive a neighbouring update (through the netevent notifier), find the corresponding efx_neigh_binder and update all its users. Since the actual generation of encap headers is still only a stub, the resulting rules still get left on fallback actions. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10sfc: MAE functions to create/update/delete encap headersEdward Cree2-2/+95
Besides the raw header data, also pass the tunnel type, so that the hardware knows it needs to update the IP Total Length and UDP Length fields (and corresponding checksums) for each packet. Also, populate the ENCAP_HEADER_ID field in efx_mae_alloc_action_set() with the fw_id returned from efx_mae_allocate_encap_md(). Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10sfc: add function to atomically update a rule in the MAEEdward Cree2-0/+24
efx_mae_update_rule() changes the action-set-list attached to an MAE flow rule in the Action Rule Table. We will use this when neighbouring updates change encap actions. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10sfc: some plumbing towards TC encap action offloadEdward Cree5-3/+284
Create software objects to manage the metadata for encap actions that can be attached to TC rules. However, since we don't yet have the neighbouring information (needed to generate the Ethernet header), all rules with encap actions are marked as "unready" and thus insert the fallback action into hardware rather than actually offloading the encapsulation action. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10sfc: add fallback action-set-lists for TC offloadEdward Cree2-0/+77
When offloading a TC encap action, the action information for the hardware might not be "ready": if there's currently no neighbour entry available for the destination address, we can't construct the Ethernet header to prepend to the packet. In this case, we still offload the flow rule, but with its action-set-list ID pointing at a "fallback" action which simply delivers the packet to its default destination (as though no flow rule had matched), thus allowing software TC to handle it. Later, when we receive a neighbouring update that allows us to construct the encap header, the rule will become "ready" and we will update its action-set-list ID in hardware to point at the actual offloaded actions. This patch sets up these fallback ASLs, but does not yet use them. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10net: move gso declarations and functions to their own filesEric Dumazet46-365/+425
Move declarations into include/net/gso.h and code into net/core/gso.c Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stanislav Fomichev <sdf@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230608191738.3947077-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10Merge branch 'mptcp-unify-pm-interfaces'Jakub Kicinski4-82/+103
Matthieu Baerts says: ==================== mptcp: unify PM interfaces These patches from Geliang better isolate the two MPTCP path-managers by avoiding calling userspace PM functions from the in-kernel PM. Instead, new functions declared in pm.c directly dispatch to the right PM. In addition to have a clearer code, this also avoids a bit of duplicated checks. This is a refactoring, there is no behaviour change intended here. ==================== Link: https://lore.kernel.org/r/20230608-upstream-net-next-20230608-mptcp-unify-pm-interfaces-v1-0-b301717c9ff5@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10mptcp: unify pm set_flags interfacesGeliang Tang3-32/+51
This patch unifies the three PM set_flags() interfaces: mptcp_pm_nl_set_flags() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_set_flags() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_set_flags() in mptcp/pm.c based on whether token is NULL or not. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10mptcp: unify pm get_flags_and_ifindex_by_idGeliang Tang4-22/+24
This patch unifies the three PM get_flags_and_ifindex_by_id() interfaces: mptcp_pm_nl_get_flags_and_ifindex_by_id() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_get_flags_and_ifindex_by_id() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_get_flags_and_ifindex_by_id() in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10mptcp: unify pm get_local_id interfacesGeliang Tang3-21/+21
This patch unifies the three PM get_local_id() interfaces: mptcp_pm_nl_get_local_id() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_get_local_id() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_get_local_id() in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not. Also put together the declarations of these three functions in protocol.h. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10mptcp: export local_addressGeliang Tang2-9/+9
Rename local_address() with "mptcp_" prefix and export it in protocol.h. This function will be re-used in the common PM code (pm.c) in the following commit. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10Merge tag 'wireless-next-2023-06-09' of ↵Jakub Kicinski173-6187/+33400
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.5 The second pull request for v6.5. We have support for three new Realtek chipsets, all from different generations. Shows how active Realtek development is right now, even older generations are being worked on. Note: We merged wireless into wireless-next to avoid complex conflicts between the trees. Major changes: rtl8xxxu - RTL8192FU support rtw89 - RTL8851BE support rtw88 - RTL8723DS support ath11k - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode iwlwifi - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature cfg80211/mac80211 - more Multi-Link Operation (MLO) support such as hardware restart - fixes for a potential work/mutex deadlock and with it beginnings of the previously discussed locking simplifications * tag 'wireless-next-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (162 commits) wifi: rtlwifi: remove misused flag from HAL data wifi: rtlwifi: remove unused dualmac control leftovers wifi: rtlwifi: remove unused timer and related code wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown wifi: rsi: Do not configure WoWlan in shutdown hook if not enabled wifi: brcmfmac: Detect corner error case earlier with log wifi: rtw89: 8852c: update RF radio A/B parameters to R63 wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (3 of 3) wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (2 of 3) wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (1 of 3) wifi: rtw89: process regulatory for 6 GHz power type wifi: rtw89: regd: update regulatory map to R64-R40 wifi: rtw89: regd: judge 6 GHz according to chip and BIOS wifi: rtw89: refine clearing supported bands to check 2/5 GHz first wifi: rtw89: 8851b: configure CRASH_TRIGGER feature for 8851B wifi: rtw89: set TX power without precondition during setting channel wifi: rtw89: debug: txpwr table access only valid page according to chip wifi: rtw89: 8851b: enable hw_scan support wifi: cfg80211: move scan done work to wiphy work wifi: cfg80211: move sched scan stop to wiphy work ... ==================== Link: https://lore.kernel.org/r/87bkhohkbg.fsf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10Merge branch 'tools-ynl-gen-code-gen-improvements-before-ethtool'Jakub Kicinski8-329/+186
Jakub Kicinski says: ==================== tools: ynl-gen: code gen improvements before ethtool I was going to post ethtool but I couldn't stand the ugliness of the if conditions which were previously generated. So I cleaned that up and improved a number of other things ethtool will benefit from. ==================== Link: https://lore.kernel.org/r/20230608211200.1247213-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>