summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2024-04-03gve: add support to change ring size via ethtoolHarshitha Ramamurthy3-14/+95
Allow the user to change ring size via ethtool if supported by the device. The driver relies on the ring size ranges queried from device to validate ring sizes requested by the user. Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03gve: add support to read ring size ranges from the deviceHarshitha Ramamurthy3-24/+102
Add support to read ring size change capability and the min and max descriptor counts from the device and store it in the driver. Also accommodate a special case where the device does not provide minimum ring size depending on the version of the device. In that case, rely on default values for the minimums. Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03gve: set page count for RX QPL for GQI and DQO queue formatsHarshitha Ramamurthy5-22/+20
Fulfill the requirement that for GQI, the number of pages per RX QPL is equal to the ring size. Set this value to be equal to ring size. Because of this change, the rx_data_slot_cnt and rx_pages_per_qpl fields stored in the priv structure are not needed, so remove their usage. And for DQO, the number of pages per RX QPL is more than ring size to account for out-of-order completions. So set it to two times of rx ring size. Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03gve: make the completion and buffer ring size equal for DQOHarshitha Ramamurthy5-43/+13
For the DQO queue format, the gve driver stores two ring sizes for both TX and RX - one for completion queue ring and one for data buffer ring. This is supposed to enable asymmetric sizes for these two rings but that is not supported. Make both fields reference the same single variable. This change renders reading supported TX completion ring size and RX buffer ring size for DQO from the device useless, so change those fields to reserved and remove related code. Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03gve: simplify setting decriptor count defaultsHarshitha Ramamurthy1-29/+15
Combine the gve_set_desc_cnt and gve_set_desc_cnt_dqo into one function which sets the counts after checking the queue format. Both the functions in the previous code and the new combined function never return an error so make the new function void and remove the goto on error. Also rename the new function to gve_set_default_desc_cnt to be clearer about its intention. Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03octeontx2-pf: Reset MAC stats during probeSai Krishna9-0/+100
Reset CGX/RPM MAC HW statistics at the time of driver probe() Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03Merge branch '100GbE' of ↵Jakub Kicinski20-500/+814
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-04-01 (ice) This series contains updates to ice driver only. Michal Schmidt changes flow for gettimex64 to use host-side spinlock rather than hardware semaphore for lighter-weight locking. Steven adds ability for switch recipes to be re-used when firmware supports it. Thorsten Blum removes unwanted newlines in netlink messaging. Michal Swiatkowski and Piotr re-organize devlink related code; renaming, moving, and consolidating it to a single location. Michal also simplifies the devlink init and cleanup path to occur under a single lock call. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: hold devlink lock for whole init/cleanup ice: move devlink port code to a separate file ice: move ice_devlink.[ch] to devlink folder ice: Remove newlines in NL_SET_ERR_MSG_MOD ice: Add switch recipe reusing feature ice: fold ice_ptp_read_time into ice_ptp_gettimex64 ice: avoid the PTP hardware semaphore in gettimex64 path ice: add ice_adapter for shared data across PFs on the same NIC ==================== Link: https://lore.kernel.org/r/20240401172421.1401696-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03net: team: use policy generated by YAML specHangbin Liu4-55/+98
generated with: $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \ > --spec Documentation/netlink/specs/team.yaml --source \ > -o drivers/net/team/team_nl.c $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \ > --spec Documentation/netlink/specs/team.yaml --header \ > -o drivers/net/team/team_nl.h The TEAM_ATTR_LIST_PORT in team_nl_policy is removed as it is only in the port list reply attributes. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240401031004.1159713-4-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03net: team: rename team to team_core for linkingHangbin Liu2-0/+1
Similar with commit 08d323234d10 ("net: fou: rename the source for linking"), We'll need to link two objects together to form the team module. This means the source can't be called team, the build system expects team.o to be the combined object. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240401031004.1159713-3-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03net/dpaa2: Avoid explicit cpumask var allocation on stackDawei Li1-5/+9
For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask variable on stack is not recommended since it can cause potential stack overflow. Instead, kernel code should always use *cpumask_var API(s) to allocate cpumask var in config-neutral way, leaving allocation strategy to CONFIG_CPUMASK_OFFSTACK. Use *cpumask_var API(s) to address it. Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Link: https://lore.kernel.org/r/20240331053441.1276826-3-dawei.li@shingroup.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03net: dsa: sja1105: drop driver owner assignmentKrzysztof Kozlowski1-1/+0
Core in spi_register_driver() already sets the .owner, so driver does not need to. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20240330211023.100924-2-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03net: dsa: microchip: drop driver owner assignmentKrzysztof Kozlowski1-1/+0
Core in spi_register_driver() already sets the .owner, so driver does not need to. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20240330211023.100924-1-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03nfp: Avoid -Wflex-array-member-not-at-end warningsGustavo A. R. Silva1-16/+25
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally. There is currently an object (`tl`), at the beginning of multiple structures, that contains a flexible structure (`struct nfp_dump_tl`), for example: struct nfp_dumpspec_csr { struct nfp_dump_tl tl; ... __be32 register_width; /* in bits */ }; So, in order to avoid ending up with flexible-array members in the middle of multiple other structs, we use the `struct_group_tagged()` helper to separate the flexible array from the rest of the members in the flexible structure: struct nfp_dump_tl { struct_group_tagged(nfp_dump_tl_hdr, hdr, ... the rest of members ); char data[]; }; With the change described above, we now declare objects of the type of the tagged struct, in this case `struct nfp_dump_tl_hdr`, without embedding flexible arrays in the middle of another struct: struct nfp_dumpspec_csr { struct nfp_dump_tl_hdr tl; ... __be32 register_width; /* in bits */ }; Also, use `container_of()` whenever we need to retrieve a pointer to the flexible structure, through which we can access the flexible array if needed. So, with these changes, fix 33 of the following warnings: drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:58:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:64:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:70:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:78:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:87:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:92:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Link: https://github.com/KSPP/linux/issues/202 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/ZgYWlkxdrrieDYIu@neat Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03net: phy: aquantia: add support for AQR114C PHY IDPaweł Owoc1-0/+21
Add support for AQR114C PHY ID. This PHY advertise 10G speed: SPEED(0x04): 0x6031 capabilities: -400g +5g +2.5g -200g -25g -10g-xr -100g -40g -10g/1g -10 +100 +1000 -10-ts -2-tl +10g EXTABLE(0x0B): 0x40fc capabilities: -10g-cx4 -10g-lrm +10g-t +10g-kx4 +10g-kr +1000-t +1000-kx +100-tx -10-t -p2mp -40g/100g -1000/100-t1 -25g -200g/400g +2.5g/5g -1000-h but supports only up to 5G speed (as with AQR111/111B0). AQR111 init config is used to set max speed 5G. Signed-off-by: Paweł Owoc <frut3k7@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240401145114.1699451-1-frut3k7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02genetlink: remove linux/genetlink.hJakub Kicinski1-1/+1
genetlink.h is a shell of what used to be a combined uAPI and kernel header over a decade ago. It has fewer than 10 lines of code. Merge it into net/genetlink.h. In some ways it'd be better to keep the combined header under linux/ but it would make looking through git history harder. Acked-by: Sven Eckelmann <sven@narfation.org> Link: https://lore.kernel.org/r/20240329175710.291749-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02Merge branch '1GbE' of ↵Jakub Kicinski13-649/+582
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-03-29 (net: intel) This series contains updates to most Intel drivers. Jesse moves declaration of pci_driver struct to remove need for forward declarations in igb and converts Intel drivers to user newer power management ops. Sasha reworks power management flow on igc to avoid using rtnl_lock() during those flows. Maciej reorganizes i40e_nvm file to avoid forward declarations. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: i40e: avoid forward declarations in i40e_nvm.c igc: Refactor runtime power management flow net: intel: implement modern PM ops declarations igb: simplify pci ops declaration ==================== Link: https://lore.kernel.org/r/20240329175632.211340-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01ice: hold devlink lock for whole init/cleanupMichal Swiatkowski2-20/+19
Simplify devlink lock code in driver by taking it for whole init/cleanup path. Instead of calling devlink functions that taking lock call the lockless versions. Suggested-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: move devlink port code to a separate filePiotr Raczynski7-424/+445
Keep devlink related code in a separate file. More devlink port code and handlers will be added here for other port operations. Remove no longer needed include of our devlink.h in ice_lib.c. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: move ice_devlink.[ch] to devlink folderMichal Swiatkowski8-7/+8
Only moving whole files, fixing Makefile and bunch of includes. Some changes to ice_devlink file was done even in representor part (Tx topology), so keep it as final patch to not mess up with rebasing. After moving to devlink folder there is no need to have such long name for these files. Rename them to simple devlink. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: Remove newlines in NL_SET_ERR_MSG_MODThorsten Blum1-3/+3
Fixes Coccinelle/coccicheck warnings reported by newline_in_nl_msg.cocci. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: Add switch recipe reusing featureSteven Zou5-17/+177
New E810 firmware supports the corresponding functionality, so the driver allows PFs to subscribe the same switch recipes. Then when the PF is done with a switch recipes, the PF can ask firmware to free that switch recipe. When users configure a rule to PFn into E810 switch component, if there is no existing recipe matching this rule's pattern, the driver will request firmware to allocate and return a new recipe resource for the rule by calling ice_add_sw_recipe() and ice_alloc_recipe(). If there is an existing recipe matching this rule's pattern with different key value, or this is a same second rule to PFm into switch component, the driver checks out this recipe by calling ice_find_recp(), the driver will tell firmware to share using this same recipe resource by calling ice_subscribable_recp_shared() and ice_subscribe_recipe(). When firmware detects that all subscribing PFs have freed the switch recipe, firmware will free the switch recipe so that it can be reused. This feature also fixes a problem where all switch recipes would eventually be exhausted because switch recipes could not be freed, as freeing a shared recipe could potentially break other PFs that were using it. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Andrii Staikov <andrii.staikov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steven Zou <steven.zou@intel.com> Tested-by: Mayank Sharma <mayank.sharma@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: fold ice_ptp_read_time into ice_ptp_gettimex64Michal Schmidt1-22/+3
This is a cleanup. It is unnecessary to have this function just to call another function. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Sai Krishna <saikrishnag@marvell.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: avoid the PTP hardware semaphore in gettimex64 pathMichal Schmidt4-7/+12
The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize operations that program the PTP timers. The operations involve issuing commands to the sideband queue. The E810 does not have a hardware sideband queue, so the admin queue is used. The admin queue is slow. I have observed delays in hundreds of milliseconds waiting for ice_sq_done. When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is held by a task performing one of the slow operations, ice_ptp_lock can easily time out. phc2sys gets -EBUSY and the kernel prints: ice 0000:XX:YY.0: PTP failed to get time These messages appear once every few seconds, causing log spam. The E810 datasheet recommends an algorithm for reading the upper 64 bits of the GLTSYN_TIME register. It matches what's implemented in ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not necessarily against the concurrent setting of the register (with GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why ice_ptp_gettimex64 also takes PFTSYN_SEM. The race with time setters can be prevented without relying on the PTP hardware semaphore. Using the "ice_adapter" from the previous patch, we can have a common spinlock for the PFs that share the clock hardware. It will protect the reading and writing to the GLTSYN_TIME register. The writing is performed indirectly, by the hardware, as a result of the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't sure if the ice_flush there is enough to make sure GLTSYN_TIME has been updated, but it works well in my testing. My test code can be seen here: https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-10 It consists of: - kernel threads reading the time in a busy loop and looking at the deltas between consecutive values, reporting new maxima. - a shell script that sets the time repeatedly; - a bpftrace probe to produce a histogram of the measured deltas. Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing. Deltas in the [2G, 4G) range appear in the histograms. With the spinlock added, there is no tearing and the biggest delta I saw was in the range [1M, 2M), that is under 2 ms. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: add ice_adapter for shared data across PFs on the same NICMichal Schmidt5-1/+148
There is a need for synchronization between ice PFs on the same physical adapter. Add a "struct ice_adapter" for holding data shared between PFs of the same multifunction PCI device. The struct is refcounted - each ice_pf holds a reference to it. Its first use will be for PTP. I expect it will be useful also to improve the ugliness that is ice_prot_id_tbl. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: Add support for PFCP hardware offload in switchdevMarcin Szycik7-7/+169
Add support for creating PFCP filters in switchdev mode. Add support for parsing PFCP-specific tc options: S flag and SEID. To create a PFCP filter, a special netdev must be created and passed to tc command: ip link add pfcp0 type pfcp tc filter add dev eth0 ingress prio 1 flower pfcp_opts \ 1:123/ff:fffffffffffffff0 skip_hw action mirred egress redirect \ dev pfcp0 Changes in iproute2 [1] are required to be able to use pfcp_opts in tc. ICE COMMS package is required to create a filter as it contains PFCP profiles. Link: https://lore.kernel.org/netdev/20230614091758.11180-1-marcin.szycik@linux.intel.com [1] Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01ice: refactor ICE_TC_FLWR_FIELD_ENC_OPTSMarcin Szycik2-6/+6
FLOW_DISSECTOR_KEY_ENC_OPTS can be used for multiple headers, but currently it is treated as GTP-exclusive in ice. Rename ICE_TC_FLWR_FIELD_ENC_OPTS to ICE_TC_FLWR_FIELD_GTP_OPTS and check for tunnel type earlier. After this refactor, it is easier to add new headers using FLOW_DISSECTOR_KEY_ENC_OPTS - instead of checking tunnel type in ice_tc_count_lkups() and ice_tc_fill_tunnel_outer(), it needs to be checked only once, in ice_parse_tunnel_attr(). Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01pfcp: always set pfcp metadataMichal Swiatkowski1-1/+80
In PFCP receive path set metadata needed by flower code to do correct classification based on this metadata. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01pfcp: add PFCP moduleWojciech Drewek3-0/+237
Packet Forwarding Control Protocol (PFCP) is a 3GPP Protocol used between the control plane and the user plane function. It is specified in TS 29.244[1]. Note that this module is not designed to support this Protocol in the kernel space. There is no support for parsing any PFCP messages. There is no API that could be used by any userspace daemon. Basically it does not support PFCP. This protocol is sophisticated and there is no need for implementing it in the kernel. The purpose of this module is to allow users to setup software and hardware offload of PFCP packets using tc tool. When user requests to create a PFCP device, a new socket is created. The socket is set up with port number 8805 which is specific for PFCP [29.244 4.2.2]. This allow to receive PFCP request messages, response messages use other ports. Note that only one PFCP netdev can be created. Only IPv4 is supported at this time. [1] https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111 Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01ip_tunnel: convert __be16 tunnel flags to bitmapsAlexander Lobakin12-65/+124
Historically, tunnel flags like TUNNEL_CSUM or TUNNEL_ERSPAN_OPT have been defined as __be16. Now all of those 16 bits are occupied and there's no more free space for new flags. It can't be simply switched to a bigger container with no adjustments to the values, since it's an explicit Endian storage, and on LE systems (__be16)0x0001 equals to (__be64)0x0001000000000000. We could probably define new 64-bit flags depending on the Endianness, i.e. (__be64)0x0001 on BE and (__be64)0x00010000... on LE, but that would introduce an Endianness dependency and spawn a ton of Sparse warnings. To mitigate them, all of those places which were adjusted with this change would be touched anyway, so why not define stuff properly if there's no choice. Define IP_TUNNEL_*_BIT counterparts as a bit number instead of the value already coded and a fistful of <16 <-> bitmap> converters and helpers. The two flags which have a different bit position are SIT_ISATAP_BIT and VTI_ISVTI_BIT, as they were defined not as __cpu_to_be16(), but as (__force __be16), i.e. had different positions on LE and BE. Now they both have strongly defined places. Change all __be16 fields which were used to store those flags, to IP_TUNNEL_DECLARE_FLAGS() -> DECLARE_BITMAP(__IP_TUNNEL_FLAG_NUM) -> unsigned long[1] for now, and replace all TUNNEL_* occurrences to their bitmap counterparts. Use the converters in the places which talk to the userspace, hardware (NFP) or other hosts (GRE header). The rest must explicitly use the new flags only. This must be done at once, otherwise there will be too many conversions throughout the code in the intermediate commits. Finally, disable the old __be16 flags for use in the kernel code (except for the two 'irregular' flags mentioned above), to prevent any accidental (mis)use of them. For the userspace, nothing is changed, only additions were made. Most noticeable bloat-o-meter difference (.text): vmlinux: 307/-1 (306) gre.ko: 62/0 (62) ip_gre.ko: 941/-217 (724) [*] ip_tunnel.ko: 390/-900 (-510) [**] ip_vti.ko: 138/0 (138) ip6_gre.ko: 534/-18 (516) [*] ip6_tunnel.ko: 118/-10 (108) [*] gre_flags_to_tnl_flags() grew, but still is inlined [**] ip_tunnel_find() got uninlined, hence such decrease The average code size increase in non-extreme case is 100-200 bytes per module, mostly due to sizeof(long) > sizeof(__be16), as %__IP_TUNNEL_FLAG_NUM is less than %BITS_PER_LONG and the compilers are able to expand the majority of bitmap_*() calls here into direct operations on scalars. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01ip_tunnel: use a separate struct to store tunnel params in the kernelAlexander Lobakin3-16/+20
Unlike IPv6 tunnels which use purely-kernel __ip6_tnl_parm structure to store params inside the kernel, IPv4 tunnel code uses the same ip_tunnel_parm which is being used to talk with the userspace. This makes it difficult to alter or add any fields or use a different format for whatever data. Define struct ip_tunnel_parm_kern, a 1:1 copy of ip_tunnel_parm for now, and use it throughout the code. Define the pieces, where the copy user <-> kernel happens, as standalone functions, and copy the data there field-by-field, so that the kernel-side structure could be easily modified later on and the users wouldn't have to care about this. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01bitmap: introduce generic optimized bitmap_size()Alexander Lobakin2-6/+1
The number of times yet another open coded `BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge. Some generic helper is long overdue. Add one, bitmap_size(), but with one detail. BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both divident and divisor are compile-time constants or when the divisor is not a pow-of-2. When it is however, the compilers sometimes tend to generate suboptimal code (GCC 13): 48 83 c0 3f add $0x3f,%rax 48 c1 e8 06 shr $0x6,%rax 48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx %BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does full division of `nbits + 63` by it and then multiplication by 8. Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC: 8d 50 3f lea 0x3f(%rax),%edx c1 ea 03 shr $0x3,%edx 81 e2 f8 ff ff 1f and $0x1ffffff8,%edx Now it shifts `nbits + 63` by 3 positions (IOW performs fast division by 8) and then masks bits[2:0]. bloat-o-meter: add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617) Clang does it better and generates the same code before/after starting from -O1, except that with the ALIGN() approach it uses %edx and thus still saves some bytes: add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520) Note that we can't expand DIV_ROUND_UP() by adding a check and using this approach there, as it's used in array declarations where expressions are not allowed. Add this helper to tools/ as well. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Acked-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01s390/cio: rename bitmap_size() -> idset_bitmap_size()Alexander Lobakin1-4/+6
bitmap_size() is a pretty generic name and one may want to use it for a generic bitmap API function. At the same time, its logic is not "generic", i.e. it's not just `nbits -> size of bitmap in bytes` converter as it would be expected from its name. Add the prefix 'idset_' used throughout the file where the function resides. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-30netlink: introduce type-checking attribute iterationJohannes Berg7-35/+11
There are, especially with multi-attr arrays, many cases of needing to iterate all attributes of a specific type in a netlink message or a nested attribute. Add specific macros to support that case. Also convert many instances using this spatch: @@ iterator nla_for_each_attr; iterator name nla_for_each_attr_type; identifier nla; expression head, len, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_attr(nla, head, len, rem) +nla_for_each_attr_type(nla, ATTR, head, len, rem) { <... T x; ...> -if (nla_type(nla) == ATTR) { ... -} } @@ identifier nla; iterator nla_for_each_nested; iterator name nla_for_each_nested_type; expression attr, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_nested(nla, attr, rem) +nla_for_each_nested_type(nla, ATTR, attr, rem) { <... T x; ...> -if (nla_type(nla) == ATTR) { ... -} } @@ iterator nla_for_each_attr; iterator name nla_for_each_attr_type; identifier nla; expression head, len, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_attr(nla, head, len, rem) +nla_for_each_attr_type(nla, ATTR, head, len, rem) { <... T x; ...> -if (nla_type(nla) != ATTR) continue; ... } @@ identifier nla; iterator nla_for_each_nested; iterator name nla_for_each_nested_type; expression attr, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_nested(nla, attr, rem) +nla_for_each_nested_type(nla, ATTR, attr, rem) { <... T x; ...> -if (nla_type(nla) != ATTR) continue; ... } Although I had to undo one bad change this made, and I also adjusted some other code for whitespace and to use direct variable initialization now. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20240328203144.b5a6c895fb80.I1869b44767379f204998ff44dd239803f39c23e0@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29mlx5: stop warning for 64KB pagesArnd Bergmann1-2/+4
When building with 64KB pages, clang points out that xsk->chunk_size can never be PAGE_SIZE: drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c:19:22: error: result of comparison of constant 65536 with expression of type 'u16' (aka 'unsigned short') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (xsk->chunk_size > PAGE_SIZE || ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~ In older versions of this code, using PAGE_SIZE was the only possibility, so this would have never worked on 64KB page kernels, but the patch apparently did not address this case completely. As Maxim Mikityanskiy suggested, 64KB chunks are really not all that useful, so just shut up the warning by adding a cast. Fixes: 282c0c798f8e ("net/mlx5e: Allow XSK frames smaller than a page") Link: https://lore.kernel.org/netdev/20211013150232.2942146-1-arnd@kernel.org/ Link: https://lore.kernel.org/lkml/a7b27541-0ebb-4f2d-bd06-270a4d404613@app.fastmail.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Reviewed-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240328143051.1069575-9-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29net: axienet: Fix kernel doc warningsSuraj Gupta2-5/+22
Add description of mdio enable, mdio disable and mdio wait functions. Add description of skb pointer in axidma_bd data structure. Remove 'phy_node' description in axienet local data structure since it is not a valid struct member. Correct description of struct axienet_option. Fix below kernel-doc warnings in drivers/net/ethernet/xilinx/: 1) xilinx_axienet_mdio.c:1: warning: no structured comments found 2) xilinx_axienet.h:379: warning: Function parameter or struct member 'skb' not described in 'axidma_bd' 3) xilinx_axienet.h:538: warning: Excess struct member 'phy_node' description in 'axienet_local' 4) xilinx_axienet.h:1002: warning: expecting prototype for struct axiethernet_option. Prototype was for struct axienet_option instead Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://lore.kernel.org/r/20240328110713.12885-1-suraj.gupta2@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29octeontx2-pf: remove unused variables req_hdr and rsp_hdrSu Hui1-6/+2
Clang static checker(scan-buid): drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c:503:2: warning: Value stored to 'rsp_hdr' is never read [deadcode.DeadStores] Remove these unused variables to save some space. Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240328020723.4071539-1-suhui@nfschina.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29nfc: st95hf: drop driver owner assignmentKrzysztof Kozlowski1-1/+0
Core in spi_register_driver() already sets the .owner, so driver does not need to. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240327174810.519676-4-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29nfc: mrvl: spi: drop driver owner assignmentKrzysztof Kozlowski1-1/+0
Core in spi_register_driver() already sets the .owner, so driver does not need to. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240327174810.519676-3-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29net: wwan: mhi: drop driver owner assignmentKrzysztof Kozlowski1-1/+0
Core in mhi_driver_register() already sets the .owner, so driver does not need to. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Link: https://lore.kernel.org/r/20240327174810.519676-2-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29net: microchip: encx24j600: drop driver owner assignmentKrzysztof Kozlowski1-1/+0
Core in spi_register_driver() already sets the .owner, so driver does not need to. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240327174810.519676-1-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29mlx5: avoid truncating error messageArnd Bergmann1-1/+1
clang warns that one error message is too long for its destination buffer: drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c:1876:4: error: 'snprintf' will always be truncated; specified size is 80, but format string expands to at least 94 [-Werror,-Wformat-truncation-non-kprintf] Reword it to be a bit shorter so it always fits. Fixes: 70f0302b3f20 ("net/mlx5: Bridge, implement mdb offload") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20240326223825.4084412-5-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29qed: avoid truncating work queue lengthArnd Bergmann1-5/+4
clang complains that the temporary string for the name passed into alloc_workqueue() is too short for its contents: drivers/net/ethernet/qlogic/qed/qed_main.c:1218:3: error: 'snprintf' will always be truncated; specified size is 16, but format string expands to at least 18 [-Werror,-Wformat-truncation] There is no need for a temporary buffer, and the actual name of a workqueue is 32 bytes (WQ_NAME_LEN), so just use the interface as intended to avoid the truncation. Fixes: 59ccf86fe69a ("qed: Add driver infrastucture for handling mfw requests.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240326223825.4084412-4-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29enetc: avoid truncating error messageArnd Bergmann1-1/+1
As clang points out, the error message in enetc_setup_xdp_prog() still does not fit in the buffer and will be truncated: drivers/net/ethernet/freescale/enetc/enetc.c:2771:3: error: 'snprintf' will always be truncated; specified size is 80, but format string expands to at least 87 [-Werror,-Wformat-truncation] Replace it with an even shorter message that should fit. Fixes: f968c56417f0 ("net: enetc: shorten enetc_setup_xdp_prog() error message to fit NETLINK_MAX_FMTMSG_LEN") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240326223825.4084412-3-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29octeontx2-af: Increase maximum BPID channelsRadha Mohan Chintakuntla1-4/+2
Any NIX interface type can have maximum 256 channels. So increased the backpressure ID count to 256 so that it can cover cn9k and cn10k SoCs that have different NIX interface types with varied maximum channels. Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com> Link: https://lore.kernel.org/r/20240326184514.1628284-1-radhac@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29nfc: st95hf: Switch to using gpiod APIAndy Shevchenko1-16/+11
This updates the driver to gpiod API, and removes yet another use of of_get_named_gpio(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240326175836.1418718-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29net: phy: air_en8811h: Add the Airoha EN8811H PHY driverEric Woudstra3-0/+1092
Add the driver for the Airoha EN8811H 2.5 Gigabit PHY. The phy supports 100/1000/2500 Mbps with auto negotiation only. The driver uses two firmware files, for which updated versions are added to linux-firmware already. Note: At phy-address + 8 there is another device on the mdio bus, that belongs to the EN881H. While the original driver writes to it, Airoha has confirmed this is not needed. Therefore, communication with this device is not included in this driver. Signed-off-by: Eric Woudstra <ericwouds@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240326162305.303598-3-ericwouds@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-29i40e: avoid forward declarations in i40e_nvm.cMaciej Fijalkowski1-521/+489
Move code around to get rid of forward declarations. No functional changes. After a plain code juggling, checkpatch reported: total: 0 errors, 7 warnings, 12 checks, 1581 lines checked so while at it let's address old issues as well. Should we ever address the remaining unnecessary forward declarations within drivers/net/ethernet/intel/, consider this change as a starting point/reference. As reported in [0], there would be a lot more of work to do...if we care. [0]: https://lore.kernel.org/intel-wired-lan/Zeh8qadiTGf413YU@boxer/T/#u Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-29igc: Refactor runtime power management flowSasha Neftin1-17/+15
Following the corresponding discussion [1] and [2] refactor the 'igc_open' method and avoid taking the rtnl_lock() during the 'igc_resume' method. The rtnl_lock is held by the upper layer and could lead to the deadlock during resuming from a runtime power management flow. Notify the stack of the actual queue counts 'netif_set_real_num_*_queues' outside the '_igc_open' wrapper. This notification doesn't have to be called on each resume. Test: 1. Disconnect the ethernet cable 2. Enable the runtime power management via file system: echo auto > /sys/devices/pci0000\.../power/control 3. Check the device state (lspci -s <device> -vvv | grep -i Status) Link: https://lore.kernel.org/netdev/20231206113934.8d7819857574.I2deb5804 ef1739a2af307283d320ef7d82456494@changeid/#r [1] Link: https://lore.kernel.org/netdev/20211125074949.5f897431@kicinski-fedo ra-pc1c0hjn.dhcp.thefacebook.com/t/ [2] Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-29net: intel: implement modern PM ops declarationsJesse Brandeburg12-92/+64
Switch the Intel networking drivers to use the new power management ops declaration formats and macros, which allows us to drop __maybe_unused, as well as a bunch of ifdef checking CONFIG_PM. This is safe to do because the compiler drops the unused functions, verified by checking for any of the power management function symbols being present in System.map for a build without CONFIG_PM. If a driver has runtime PM, define the ops with pm_ptr(), and if the driver has Simple PM, use pm_sleep_ptr(), as well as the new versions of the macros for declaring the members of the pm_ops structs. Checked with network-enabled allnoconfig, allyesconfig, allmodconfig on x64_64. Reviewed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-29igb: simplify pci ops declarationJesse Brandeburg1-29/+24
The igb driver was pre-declaring tons of functions just so that it could have an early declaration of the pci_driver struct. Delete a bunch of the declarations and move the struct to the bottom of the file, after all the functions are declared. Reviewed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>