path: root/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
2024-04-17  net/mlx5: Correctly compare pkt reformat ids  (Cosmin Ratiu, 1 file, -2/+12)
[ Upstream commit 9eca93f4d5ab03905516a68683674d9c50ff95bd ] struct mlx5_pkt_reformat contains a naked union of a u32 id and a dr_action pointer which is used when the action is SW-managed (when pkt_reformat.owner is set to MLX5_FLOW_RESOURCE_OWNER_SW). Using id directly in that case is incorrect, as it maps to the least significant 32 bits of the 64-bit pointer in mlx5_fs_dr_action and not to the pkt reformat id allocated in firmware. For the purpose of comparing whether two rules are identical, interpreting the least significant 32 bits of the mlx5_fs_dr_action pointer as an id mostly works... until it breaks horribly and produces the outcome described in [1]. This patch fixes mlx5_flow_dests_cmp to correctly compare ids using mlx5_fs_dr_action_get_pkt_reformat_id for the SW-managed rules. Link: https://lore.kernel.org/netdev/ea5264d6-6b55-4449-a602-214c6f509c1e@163.com/T/#u [1] Fixes: 6a48faeeca10 ("net/mlx5: Add direct rule fs_cmd implementation") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240409190820.227554-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
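A minimal sketch of the comparison fix described above. The struct is trimmed to the fields named in the commit message, and the helper signature is assumed rather than copied from the patch:

    /* Sketch: only FW-owned reformats carry a valid 'id' in the union. */
    struct mlx5_pkt_reformat {
            enum mlx5_flow_resource_owner owner;           /* _FW or _SW */
            union {
                    u32 id;                                /* FW-allocated reformat id */
                    struct mlx5_fs_dr_action fs_dr_action; /* SW-steering action (wraps a pointer) */
            };
    };

    /* Resolve the id that mlx5_flow_dests_cmp() should compare. */
    static u32 pkt_reformat_cmp_id(struct mlx5_pkt_reformat *p)
    {
            /* For SW-managed steering, p->id would alias the low 32 bits of the
             * dr_action pointer (the bug being fixed), so fetch the real id
             * from the DR action instead. */
            if (p->owner == MLX5_FLOW_RESOURCE_OWNER_SW)
                    return mlx5_fs_dr_action_get_pkt_reformat_id(p);
            return p->id;
    }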
2024-04-17  net/mlx5: Properly link new fs rules into the tree  (Cosmin Ratiu, 1 file, -1/+2)
[ Upstream commit 7c6782ad4911cbee874e85630226ed389ff2e453 ] Previously, add_rule_fg would only add newly created rules from the handle into the tree when they had a refcount of 1. On the other hand, create_flow_handle tries hard to find and reference already existing identical rules instead of creating new ones. These two behaviors can result in a situation where create_flow_handle 1) creates a new rule and references it, then 2) in a subsequent step during the same handle creation references it again, resulting in a rule with a refcount of 2 that is not linked into the tree, will have a NULL parent and root and will result in a crash when the flow group is deleted because del_sw_hw_rule, invoked on rule deletion, assumes node->parent is != NULL. This happened in the wild, due to another bug related to incorrect handling of duplicate pkt_reformat ids, which lead to the code in create_flow_handle incorrectly referencing a just-added rule in the same flow handle, resulting in the problem described above. Full details are at [1]. This patch changes add_rule_fg to add new rules without parents into the tree, properly initializing them and avoiding the crash. This makes it more consistent with how rules are added to an FTE in create_flow_handle. Fixes: 74491de93712 ("net/mlx5: Add multi dest support") Link: https://lore.kernel.org/netdev/ea5264d6-6b55-4449-a602-214c6f509c1e@163.com/T/#u [1] Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240409190820.227554-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
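A rough sketch of the add_rule_fg change described above; the loop shape and helper names follow fs_core conventions but are not copied from the patch:

    /* Sketch: link every newly created rule that is not yet in the tree,
     * instead of relying on refcount == 1, so a rule referenced twice while
     * building one handle still gets a parent and a root. */
    for (i = 0; i < handle->num_rules; i++) {
            if (!handle->rule[i]->node.parent) {
                    tree_add_node(&handle->rule[i]->node, &fte->node);
                    trace_mlx5_fs_add_rule(handle->rule[i]);
            }
    }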
2023-08-24  Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux  (Jakub Kicinski, 1 file, -4/+33)
Leon Romanovsky says:
====================
mlx5 MACsec RoCEv2 support

From Patrisious: This series extends the previously added MACsec offload support to cover RoCE traffic as well. In order to achieve that, we need to configure MACsec with offload between the two endpoints, like below:

REMOTE_MAC=10:70:fd:43:71:c0
* ip addr add 1.1.1.1/16 dev eth2
* ip link set dev eth2 up
* ip link add link eth2 macsec0 type macsec encrypt on
* ip macsec offload macsec0 mac
* ip macsec add macsec0 tx sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16
* ip macsec add macsec0 rx port 1 address $REMOTE_MAC
* ip macsec add macsec0 rx port 1 address $REMOTE_MAC sa 0 pn 1 on key 01 ead3664f508eb06c40ac7104cdae4ce5
* ip addr add 10.1.0.1/16 dev macsec0
* ip link set dev macsec0 up

And in a similar manner on the other machine, noting that the key order would be reversed and the MAC address would be that of the other machine.

RDMA traffic is separated through the relevant GID entries, and in case of an IP ambiguity issue - meaning we have a physical GID and a MACsec GID with the same IP/GID - we disable the physical GID in order to force the user to only use the MACsec GID.

v0: https://lore.kernel.org/netdev/20230813064703.574082-1-leon@kernel.org/

* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  RDMA/mlx5: Handles RoCE MACsec steering rules addition and deletion
  net/mlx5: Add RoCE MACsec steering infrastructure in core
  net/mlx5: Configure MACsec steering for ingress RoCEv2 traffic
  net/mlx5: Configure MACsec steering for egress RoCEv2 traffic
  IB/core: Reorder GID delete code for RoCE
  net/mlx5: Add MACsec priorities in RDMA namespaces
  RDMA/mlx5: Implement MACsec gid addition and deletion
  net/mlx5: Maintain fs_id xarray per MACsec device inside macsec steering
  net/mlx5: Remove netdevice from MACsec steering
  net/mlx5e: Move MACsec flow steering and statistics database from ethernet to core
  net/mlx5e: Rename MACsec flow steering functions/parameters to suit core naming style
  net/mlx5: Remove dependency of macsec flow steering on ethernet
  net/mlx5e: Move MACsec flow steering operations to be used as core library
  macsec: add functions to get macsec real netdevice and check offload
====================
Link: https://lore.kernel.org/r/20230821073833.59042-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-20  net/mlx5: Configure MACsec steering for ingress RoCEv2 traffic  (Patrisious Haddad, 1 file, -1/+1)
Add steering tables/rules to check whether the decrypted traffic is RoCEv2 and, if so, copy reg_b_metadata to a temp reg and forward it to the RDMA_RX domain. The rules are added once the MACsec device is assigned an IP address; they verify that the packet IP belongs to the MACsec device and that the temp reg carries a MACsec operation and a valid SCI. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20  net/mlx5: Add MACsec priorities in RDMA namespaces  (Patrisious Haddad, 1 file, -3/+32)
Add MACsec flow steering priorities in RDMA namespaces. This allows adding tables/rules to forward RoCEv2 traffic to the MACsec crypto tables in NIC_TX domain, and accept RoCEv2 traffic from NIC_RX domain. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-04  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski, 1 file, -22/+85)
Cross-merge networking fixes after downstream PR. Conflicts: net/dsa/port.c 9945c1fb03a3 ("net: dsa: fix older DSA drivers using phylink") a88dd7538461 ("net: dsa: remove legacy_pre_march2020 detection") https://lore.kernel.org/all/20230731102254.2c9868ca@canb.auug.org.au/ net/xdp/xsk.c 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark") b7f72a30e9ac ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path") https://lore.kernel.org/all/20230731102631.39988412@canb.auug.org.au/ drivers/net/ethernet/broadcom/bnxt/bnxt.c 37b61cda9c16 ("bnxt: don't handle XDP in netpoll") 2b56b3d99241 ("eth: bnxt: handle invalid Tx completions more gracefully") https://lore.kernel.org/all/20230801101708.1dc7faac@canb.auug.org.au/ Adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c 62da08331f1a ("net/mlx5e: Set proper IPsec source port in L4 selector") fbd517549c32 ("net/mlx5e: Add function to get IPsec offload namespace") drivers/net/ethernet/sfc/selftest.c 55c1528f9b97 ("sfc: fix field-spanning memcpy in selftest") ae9d445cd41f ("sfc: Miscellaneous comment removals") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03  net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio  (Jianbo Liu, 1 file, -8/+72)
In the cited commit, a new fs_prio type, FS_TYPE_PRIO_CHAINS, was added to support multiple parallel namespaces for multi-chains, and all flow tables under an fs_node of this type are skipped unconditionally when searching for the next or previous flow table to connect to a new table. As this search function is also used to find a new root table when the old one is being deleted, it skips the entire FS_TYPE_PRIO_CHAINS fs_node next to the old root, even though the new root table should be chosen from it if it contains any table. Fix it by skipping only the flow tables in the same FS_TYPE_PRIO_CHAINS fs_node when finding the closest FT for a fs_node. Besides, complete the connection from the FTs of the previous priority of prio, because there can be multiple prevs after this fs_prio type is introduced. Also, the next FT should be chosen from the first flow table next to the prio in the same FS_TYPE_PRIO_CHAINS fs_prio, if this prio is the first child. Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/7a95754df479e722038996c97c97b062b372591f.1690803944.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03  net/mlx5: fs_core: Make find_closest_ft more generic  (Jianbo Liu, 1 file, -15/+14)
As find_closest_ft_recursive is called to find the closest FT, the first parameter of find_closest_ft can be changed from fs_prio to fs_node. Thus this function is extended to find the closest FT for the nodes of any type, not only prios, but also the sub namespaces. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/d3962c2b443ec8dde7a740dc742a1f052d5e256c.1690803944.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03  net/mlx5: Compare with old_dest param to modify rule destination  (Jianbo Liu, 1 file, -1/+1)
The rule destination must be compared with the old_dest passed in. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/24adc60d05d7492359ba343c6da1ebbe9fe284f6.1690802064.git.leon@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03  net/mlx5e: Support IPsec packet offload for TX in switchdev mode  (Jianbo Liu, 1 file, -0/+6)
IPsec encryption is done last, so add a new prio for IPsec offload in the FDB, just lower than the slow path prio and higher than the per-vport prio. Three levels are added for TX. The first one is for the ip xfrm policy. The SA table is created in the second level for the ip xfrm state. The status table is created last, to count the number of packets encrypted. The rules which forward packets to the uplink are changed to forward them to the IPsec TX tables first. These rules are restored after those tables are destroyed, which happens immediately when there is no reference to them, just as is done in legacy mode. The support for the slow path is added here, by refreshing the uplink's channels. The handling for the TC fast path, which is more complicated, will be added later. Besides, reg c4 is used instead to match on reqid. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/cfd0e6ffaf0b8c55ebaa9fb0649b7c504b6b8ec6.1690802064.git.leon@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03  net/mlx5e: Support IPsec packet offload for RX in switchdev mode  (Jianbo Liu, 1 file, -0/+6)
As decryption must be done first, add a new prio for IPsec offload in the FDB, just lower than the BYPASS prio and higher than the TC prio. Three levels are added for RX. The first one is for the ip xfrm policy. The SA table is created in the second level for the ip xfrm state. The status table is created last, to check the decryption result: on success, packets continue with the next processing stage; otherwise they are dropped. For now, the setting of reg c1 is removed for switchdev mode, and the datapath processing will be added in the next patch. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/c91063554cf643fb50b99cf093e8a9bf11729de5.1690802064.git.leon@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-27  net/mlx5: DR, Fix peer domain namespace setting  (Shay Drory, 1 file, -2/+2)
The offending patch is based on the assumption that for PFs, mlx5_get_dev_index() is the same as vhca_id. However, this assumption is wrong in case of DPU (ECPF). Fix it by using vhca_id directly, and switch the array of peers to xarray. Fixes: 6d5b7321d8af ("net/mlx5: DR, handle more than one peer domain") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-02  net/mlx5: DR, handle more than one peer domain  (Shay Drory, 1 file, -2/+3)
Currently, the DR domain is built on the assumption that each domain can have only a single peer. In order to support VF LAG of more than two ports, expand the peer domain to use an array of peers, and align the code accordingly. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-12  net/mlx5: Bridge, add per-port multicast replication tables  (Vlad Buslov, 1 file, -1/+1)
Multicast replication requires adding one more level of FDB_BR_OFFLOAD priority flow tables. The new level is used for per-port multicast-specific tables that have the following flow group structure (from highest to lowest priority): a flow group of size one that matches on source port metadata and holds a single static rule that prevents packets from being replicated to their source port, and a flow group of size one that matches all packets and forwards them to the port that owns the table. Initialize the table dynamically on all bridge ports when adding a port to a bridge that has multicast enabled, and on all existing bridge ports when receiving a multicast enable notification. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-20  net/mlx5e: Use one rule to count all IPsec Tx offloaded traffic  (Raed Salem, 1 file, -1/+1)
Currently one counter is shared between all IPsec Tx offloaded rules to count the total amount of packets/bytes that were IPsec Tx offloaded. Replace this scheme by adding a new flow table (ft) with one rule that counts all flows that pass through this table (like the Rx status ft); this ft is pointed to by all IPsec Tx offloaded rules. The above allows having a counter per Tx flow rule while keeping a separate global counter that stores the aggregate of all these per-flow counters. Signed-off-by: Raed Salem <raeds@nvidia.com> Link: https://lore.kernel.org/r/09b9119d1deb6e482fd2d17e1f5760d7c5be1e48.1678714336.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-20  net/mlx5: fs_core: Allow ignore_flow_level on TX dest  (Paul Blakey, 1 file, -1/+2)
ignore_flow_level is also supported by firmware on TX, remove this limitation. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/d0025722bfac0a82da758eb540fbf1ff3cacdf74.1678714336.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-16  Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux  (Jakub Kicinski, 1 file, -5/+39)
Leon Romanovsky says:
====================
mlx5-next changes

Following previous conversations [1] and our clear commitment to do the TC work [2], please pull the mlx5-next shared branch, which includes low-level steering logic to allow RoCEv2 traffic to be encrypted/decrypted through IPsec.

[1] https://lore.kernel.org/all/20230126230815.224239-1-saeed@kernel.org/
[2] https://lore.kernel.org/all/Y+Z7lVVWqnRBiPh2@nvidia.com/

* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Configure IPsec steering for egress RoCEv2 traffic
  net/mlx5: Configure IPsec steering for ingress RoCEv2 traffic
  net/mlx5: Add IPSec priorities in RDMA namespaces
  net/mlx5: Implement new destination type TABLE_TYPE
  net/mlx5: Introduce new destination type TABLE_TYPE
====================
Link: https://lore.kernel.org/r/20230215095624.1365200-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15  net/mlx5: Configure IPsec steering for ingress RoCEv2 traffic  (Mark Zhang, 1 file, -2/+4)
Add steering tables/rules to check if the decrypted traffic is RoCEv2, if so then forward it to RDMA_RX domain. Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15  net/mlx5: Add IPSec priorities in RDMA namespaces  (Mark Zhang, 1 file, -2/+33)
Add IPSec flow steering priorities in RDMA namespaces. This allows adding tables/rules to forward RoCEv2 traffic to the IPSec crypto tables in NIC_TX domain, and accept RoCEv2 traffic from NIC_RX domain. Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15  net/mlx5: Implement new destination type TABLE_TYPE  (Mark Zhang, 1 file, -1/+2)
Implement new destination type to support flow transition between different table types. e.g. from NIC_RX to RDMA_RX or from RDMA_TX to NIC_TX. The new destination is described in the tracepoint as follows: "mlx5_fs_add_rule: rule=00000000d53cd0ed fte=0000000048a8a6ed index=0 sw_action=<> [dst] flow_table_type=7 id:262152" Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-08  net/mlx5: fs_core, Remove redundant variable err  (Maor Dickman, 1 file, -2/+1)
Local variable "err" is not used so it is safe to remove. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-27  net/mlx5: Move flow steering devlink param to flow steering code  (Jiri Pirko, 1 file, -1/+83)
Move the param registration and handling code into the flow steering code, as they are related to each other. There is no point in having the devlink param registration done in a separate file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-10  Merge tag 'ipsec-next-2022-12-09' of …  (Jakub Kicinski, 1 file, -3/+3)
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== ipsec-next 2022-12-09 1) Add xfrm packet offload core API. From Leon Romanovsky. 2) Add xfrm packet offload support for mlx5. From Leon Romanovsky and Raed Salem. 3) Fix a typto in a error message. From Colin Ian King. * tag 'ipsec-next-2022-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: (38 commits) xfrm: Fix spelling mistake "oflload" -> "offload" net/mlx5e: Open mlx5 driver to accept IPsec packet offload net/mlx5e: Handle ESN update events net/mlx5e: Handle hardware IPsec limits events net/mlx5e: Update IPsec soft and hard limits net/mlx5e: Store all XFRM SAs in Xarray net/mlx5e: Provide intermediate pointer to access IPsec struct net/mlx5e: Skip IPsec encryption for TX path without matching policy net/mlx5e: Add statistics for Rx/Tx IPsec offloaded flows net/mlx5e: Improve IPsec flow steering autogroup net/mlx5e: Configure IPsec packet offload flow steering net/mlx5e: Use same coding pattern for Rx and Tx flows net/mlx5e: Add XFRM policy offload logic net/mlx5e: Create IPsec policy offload tables net/mlx5e: Generalize creation of default IPsec miss group and rule net/mlx5e: Group IPsec miss handles into separate struct net/mlx5e: Make clear what IPsec rx_err does net/mlx5e: Flatten the IPsec RX add rule path net/mlx5e: Refactor FTE setup code to be more clear net/mlx5e: Move IPsec flow table creation to separate function ... ==================== Link: https://lore.kernel.org/r/20221209093310.4018731-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09  net/mlx5: fs, add match on ranges API  (Yevgeny Kliteynik, 1 file, -2/+9)
Range is a new flow destination type which allows matching on a range of values instead of matching on a specific value. A range flow destination has the following fields:
- hit_ft: flow table to forward the traffic in case of hit
- miss_ft: flow table to forward the traffic in case of miss
- field: which packet characteristic to match on
- min: minimal value for the selected field
- max: maximal value for the selected field
Note: in order to match, the value in the packet should meet the following criterion: min <= value < max. Currently, the only supported field type is L2 packet length. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
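A hedged sketch of the destination layout implied by the field list above; the struct name and exact types are assumptions, not the upstream definition:

    struct mlx5_flow_dest_range_sketch {
            struct mlx5_flow_table *hit_ft;   /* used when min <= value < max */
            struct mlx5_flow_table *miss_ft;  /* used otherwise */
            u32 field;                        /* currently only L2 packet length */
            u32 min;
            u32 max;
    };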
2022-12-09  net/mlx5: fs, assert null dest pointer when dest_num is 0  (Oz Shlomo, 1 file, -0/+3)
Currently create_flow_handle() assumes a null dest pointer when there are no destinations. This might not be the case as the caller may pass an allocated dest array while setting the dest_num parameter to 0. Assert null dest array for flow rules that have no destinations (e.g. drop rule). Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221203221337.29267-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08  net/mlx5e: Create IPsec policy offload tables  (Leon Romanovsky, 1 file, -3/+3)
Add empty table to be used for IPsec policy offload. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-07  net/mlx5: Add MACsec Rx tables support to fs_core  (Lior Nahmanson, 1 file, -2/+11)
Add a new namespace for MACsec RX flows. Encrypted MACsec packets should first be decrypted and stripped of the MACsec header, and then continue through the kernel's steering pipeline. Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07  net/mlx5: Add MACsec Tx tables support to fs_core  (Lior Nahmanson, 1 file, -4/+14)
Change the EGRESS_KERNEL namespace to EGRESS_IPSEC and add a new namespace for MACsec TX. This namespace should be the last namespace for transmitted packets. Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-05  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma  (Linus Torvalds, 1 file, -1/+7)
Pull rdma updates from Jason Gunthorpe: "This cycle we got a new RDMA driver "ERDMA" for the Alibaba cloud environment. Otherwise the changes are dominated by rxe fixes. There is another RDMA driver on the list that might get merged next cycle, 'MANA' for the Azure cloud environment. Summary: - Bug fixes and small features for irdma, hns, siw, qedr, hfi1, mlx5 - General spelling/grammer fixes - rdma cm can follow changes in neighbours for control packets - Significant amounts of rxe fixes and spec compliance changes - Use the modern NAPI API - Use the bitmap API instead of open coding - Performance improvements for rtrs - Add the ERDMA driver for Alibaba cloud - Fix a use after free bug in SRP" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (99 commits) RDMA/ib_srpt: Unify checking rdma_cm_id condition in srpt_cm_req_recv() RDMA/rxe: Fix error unwind in rxe_create_qp() RDMA/mlx5: Add missing check for return value in get namespace flow RDMA/rxe: Split qp state for requester and completer RDMA/rxe: Generate error completion for error requester QP state RDMA/rxe: Update wqe_index for each wqe error completion RDMA/srpt: Fix a use-after-free RDMA/srpt: Introduce a reference count in struct srpt_device RDMA/srpt: Duplicate port name members IB/qib: Fix repeated "in" within comments RDMA/erdma: Add driver to kernel build environment RDMA/erdma: Add the ABI definitions RDMA/erdma: Add the erdma module RDMA/erdma: Add connection management (CM) support RDMA/erdma: Add verbs implementation RDMA/erdma: Add verbs header file RDMA/erdma: Add event queue implementation RDMA/erdma: Add cmdq implementation RDMA/erdma: Add main include file RDMA/erdma: Add the hardware related definitions ...
2022-07-17  net/mlx5: fs, allow flow table creation with a UID  (Mark Bloch, 1 file, -1/+1)
Add UID field to flow table attributes to allow creating flow tables with a non default (zero) uid. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-17  net/mlx5: fs, expose flow table ID to users  (Mark Bloch, 1 file, -0/+6)
Expose the flow table ID to users. This will be used by downstream patches to allow creating steering rules that point to a flow table ID. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
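The accessor amounts to a one-line getter along these lines (a sketch; the exact helper name and placement are assumed):

    /* Expose the FW-assigned table id so callers can build rules that
     * reference a flow table by id. */
    u32 mlx5_flow_table_id(struct mlx5_flow_table *ft)
    {
            return ft->id;
    }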
2022-06-15  Merge branch 'mlx5-next' of …  (Jakub Kicinski, 1 file, -8/+10)
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next: updates 2022-06-14 1) Updated HW bits and definitions for upcoming features 1.1) vport debug counters 1.2) flow meter 1.3) Execute ASO action for flow entry 1.4) enhanced CQE compression 2) Add ICM header-modify-pattern RDMA API Leon Says ========= SW steering manipulates packet's header using "modifying header" actions. Many of these actions do the same operation, but use different data each time. Currently we create and keep every one of these actions, which use expensive and limited resources. Now we introduce a new mechanism - pattern and argument, which splits a modifying action into two parts: 1. action pattern: contains the operations to be applied on packet's header, mainly set/add/copy of fields in the packet 2. action data/argument: contains the data to be used by each operation in the pattern. This way we reuse same patterns with different arguments to create new modifying actions, and since many actions share the same operations, we end up creating a small number of patterns that we keep in a dedicated cache. These modify header patterns are implemented as new type of ICM memory, so the following kernel patch series add the support for this new ICM type. ========== * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add bits and fields to support enhanced CQE compression net/mlx5: Remove not used MLX5_CAP_BITS_RW_MASK net/mlx5: group fdb cleanup to single function net/mlx5: Add support EXECUTE_ASO action for flow entry net/mlx5: Add HW definitions of vport debug counters net/mlx5: Add IFC bits and enums for flow meter RDMA/mlx5: Support handling of modify-header pattern ICM area net/mlx5: Manage ICM of type modify-header pattern net/mlx5: Introduce header-modify-pattern ICM properties ==================== Link: https://lore.kernel.org/r/20220614184028.51548-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14  net/mlx5: group fdb cleanup to single function  (Shay Drory, 1 file, -8/+10)
Currently, the allocation of FDB software objects is done in a single function, as opposed to their cleanup. Group the cleanup of the FDB software objects into a single function as well. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-08  net/mlx5: fs, fail conflicting actions  (Mark Bloch, 1 file, -3/+32)
When combining two steering rules into one, check not only that they share the same action bits but also that the referenced actions themselves are the same. This resolves an issue where, when two different rules were created with the same match, the actions were overwritten, and when one of the rules was deleted a FW syndrome could be seen in dmesg. mlx5_core 0000:03:00.0: mlx5_cmd_check:819:(pid 2105): DEALLOC_MODIFY_HEADER_CONTEXT(0x941) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1ab444) Fixes: 0d235c3fabb7 ("net/mlx5: Add hash table to search FTEs in a flow-group") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
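A simplified sketch of the stricter check described above, using mlx5_flow_act fields; the real patch covers more action objects than shown here:

    /* Sketch: identical action flags are not enough; the referenced action
     * objects must match too, otherwise the new rule conflicts with the
     * existing one and must be rejected instead of overwriting it. */
    static bool actions_conflict(const struct mlx5_flow_act *a,
                                 const struct mlx5_flow_act *b)
    {
            if (a->action != b->action)
                    return true;
            return a->modify_hdr != b->modify_hdr ||
                   a->pkt_reformat != b->pkt_reformat;
    }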
2022-05-31  net/mlx5e: TC NIC mode, fix tc chains miss table  (Maor Dickman, 1 file, -1/+1)
The cited commit changed the promisc table to be created on demand with the highest priority in the NIC table, replacing the vlan table. This caused the tc NIC tables miss flow to skip the promisc table, because it uses the vlan table as its miss table. OVS offload in NIC mode uses promisc by default, so any unicast packet handled by the tc NIC tables miss flow would skip the promisc rule and be dropped. Fix this by adding a new empty table in a new tc level with low priority and pointing the nic tc chain miss to it; the new table is managed so that it points to the vlan table if promisc is disabled and to the promisc table if enabled. Fixes: 1c46d7409f30 ("net/mlx5e: Optimize promiscuous mode") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-19  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski, 1 file, -56/+75)
drivers/net/ethernet/mellanox/mlx5/core/main.c b33886971dbc ("net/mlx5: Initialize flow steering during driver probe") 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support") f2b41b32cde8 ("net/mlx5: Remove ipsec_ops function table") https://lore.kernel.org/all/20220519040345.6yrjromcdistu7vh@sx1/ 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") 8324a02c342a ("net/mlx5: Add exit route when waiting for FW") https://lore.kernel.org/all/20220519114119.060ce014@canb.auug.org.au/ tools/testing/selftests/net/mptcp/mptcp_join.sh e274f7154008 ("selftests: mptcp: add subflow limits test-cases") b6e074e171bc ("selftests: mptcp: add infinite map testcase") 5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type") https://lore.kernel.org/all/20220516111918.366d747f@canb.auug.org.au/ net/mptcp/options.c ba2c89e0ea74 ("mptcp: fix checksum byte order") 1e39e5a32ad7 ("mptcp: infinite mapping sending") ea66758c1795 ("tcp: allow MPTCP to update the announced window") https://lore.kernel.org/all/20220519115146.751c3a37@canb.auug.org.au/ net/mptcp/pm.c 95d686517884 ("mptcp: fix subflow accounting on close") 4d25247d3ae4 ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") https://lore.kernel.org/all/20220516111435.72f35dca@canb.auug.org.au/ net/mptcp/subflow.c ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure") 0348c690ed37 ("mptcp: add the fallback check") f8d4bcacff3b ("mptcp: infinite mapping receiving") https://lore.kernel.org/all/20220519115837.380bb8d4@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-18  net/mlx5: Initialize flow steering during driver probe  (Shay Drory, 1 file, -56/+75)
Currently, software objects of flow steering are created and destroyed during reload flow. In case a device is unloaded, the following error is printed during grace period: mlx5_core 0000:00:0b.0: mlx5_fw_fatal_reporter_err_work:690:(pid 95): Driver is in error state. Unloading As a solution to fix use-after-free bugs, where we try to access these objects, when reading the value of flow_steering_mode devlink param[1], let's split flow steering creation and destruction into two routines: * init and cleanup: memory, cache, and pools allocation/free. * create and destroy: namespaces initialization and cleanup. While at it, re-order the cleanup function to mirror the init function. [1] Kasan trace: [ 385.119849 ] BUG: KASAN: use-after-free in mlx5_devlink_fs_mode_get+0x3b/0xa0 [ 385.119849 ] Read of size 4 at addr ffff888104b79308 by task bash/291 [ 385.119849 ] [ 385.119849 ] CPU: 1 PID: 291 Comm: bash Not tainted 5.17.0-rc1+ #2 [ 385.119849 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 [ 385.119849 ] Call Trace: [ 385.119849 ] <TASK> [ 385.119849 ] dump_stack_lvl+0x6e/0x91 [ 385.119849 ] print_address_description.constprop.0+0x1f/0x160 [ 385.119849 ] ? mlx5_devlink_fs_mode_get+0x3b/0xa0 [ 385.119849 ] ? mlx5_devlink_fs_mode_get+0x3b/0xa0 [ 385.119849 ] kasan_report.cold+0x83/0xdf [ 385.119849 ] ? devlink_param_notify+0x20/0x190 [ 385.119849 ] ? mlx5_devlink_fs_mode_get+0x3b/0xa0 [ 385.119849 ] mlx5_devlink_fs_mode_get+0x3b/0xa0 [ 385.119849 ] devlink_nl_param_fill+0x18a/0xa50 [ 385.119849 ] ? _raw_spin_lock_irqsave+0x8d/0xe0 [ 385.119849 ] ? devlink_flash_update_timeout_notify+0xf0/0xf0 [ 385.119849 ] ? __wake_up_common+0x4b/0x1e0 [ 385.119849 ] ? preempt_count_sub+0x14/0xc0 [ 385.119849 ] ? _raw_spin_unlock_irqrestore+0x28/0x40 [ 385.119849 ] ? __wake_up_common_lock+0xe3/0x140 [ 385.119849 ] ? __wake_up_common+0x1e0/0x1e0 [ 385.119849 ] ? __sanitizer_cov_trace_const_cmp8+0x27/0x80 [ 385.119849 ] ? __rcu_read_unlock+0x48/0x70 [ 385.119849 ] ? kasan_unpoison+0x23/0x50 [ 385.119849 ] ? __kasan_slab_alloc+0x2c/0x80 [ 385.119849 ] ? memset+0x20/0x40 [ 385.119849 ] ? __sanitizer_cov_trace_const_cmp4+0x25/0x80 [ 385.119849 ] devlink_param_notify+0xce/0x190 [ 385.119849 ] devlink_unregister+0x92/0x2b0 [ 385.119849 ] remove_one+0x41/0x140 [ 385.119849 ] pci_device_remove+0x68/0x140 [ 385.119849 ] ? pcibios_free_irq+0x10/0x10 [ 385.119849 ] __device_release_driver+0x294/0x3f0 [ 385.119849 ] device_driver_detach+0x82/0x130 [ 385.119849 ] unbind_store+0x193/0x1b0 [ 385.119849 ] ? subsys_interface_unregister+0x270/0x270 [ 385.119849 ] drv_attr_store+0x4e/0x70 [ 385.119849 ] ? drv_attr_show+0x60/0x60 [ 385.119849 ] sysfs_kf_write+0xa7/0xc0 [ 385.119849 ] kernfs_fop_write_iter+0x23a/0x2f0 [ 385.119849 ] ? sysfs_kf_bin_read+0x160/0x160 [ 385.119849 ] new_sync_write+0x311/0x430 [ 385.119849 ] ? new_sync_read+0x480/0x480 [ 385.119849 ] ? _raw_spin_lock+0x87/0xe0 [ 385.119849 ] ? __sanitizer_cov_trace_cmp4+0x25/0x80 [ 385.119849 ] ? security_file_permission+0x94/0xa0 [ 385.119849 ] vfs_write+0x4c7/0x590 [ 385.119849 ] ksys_write+0xf6/0x1e0 [ 385.119849 ] ? __x64_sys_read+0x50/0x50 [ 385.119849 ] ? 
fpregs_assert_state_consistent+0x99/0xa0 [ 385.119849 ] do_syscall_64+0x3d/0x90 [ 385.119849 ] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 385.119849 ] RIP: 0033:0x7fc36ef38504 [ 385.119849 ] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53 [ 385.119849 ] RSP: 002b:00007ffde0ff3d08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 385.119849 ] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007fc36ef38504 [ 385.119849 ] RDX: 000000000000000c RSI: 00007fc370521040 RDI: 0000000000000001 [ 385.119849 ] RBP: 00007fc370521040 R08: 00007fc36f00b8c0 R09: 00007fc36ee4b740 [ 385.119849 ] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc36f00a760 [ 385.119849 ] R13: 000000000000000c R14: 00007fc36f005760 R15: 000000000000000c [ 385.119849 ] </TASK> [ 385.119849 ] [ 385.119849 ] Allocated by task 65: [ 385.119849 ] kasan_save_stack+0x1e/0x40 [ 385.119849 ] __kasan_kmalloc+0x81/0xa0 [ 385.119849 ] mlx5_init_fs+0x11b/0x1160 [ 385.119849 ] mlx5_load+0x13c/0x220 [ 385.119849 ] mlx5_load_one+0xda/0x160 [ 385.119849 ] mlx5_recover_device+0xb8/0x100 [ 385.119849 ] mlx5_health_try_recover+0x2f9/0x3a1 [ 385.119849 ] devlink_health_reporter_recover+0x75/0x100 [ 385.119849 ] devlink_health_report+0x26c/0x4b0 [ 385.275909 ] mlx5_fw_fatal_reporter_err_work+0x11e/0x1b0 [ 385.275909 ] process_one_work+0x520/0x970 [ 385.275909 ] worker_thread+0x378/0x950 [ 385.275909 ] kthread+0x1bb/0x200 [ 385.275909 ] ret_from_fork+0x1f/0x30 [ 385.275909 ] [ 385.275909 ] Freed by task 65: [ 385.275909 ] kasan_save_stack+0x1e/0x40 [ 385.275909 ] kasan_set_track+0x21/0x30 [ 385.275909 ] kasan_set_free_info+0x20/0x30 [ 385.275909 ] __kasan_slab_free+0xfc/0x140 [ 385.275909 ] kfree+0xa5/0x3b0 [ 385.275909 ] mlx5_unload+0x2e/0xb0 [ 385.275909 ] mlx5_unload_one+0x86/0xb0 [ 385.275909 ] mlx5_fw_fatal_reporter_err_work.cold+0xca/0xcf [ 385.275909 ] process_one_work+0x520/0x970 [ 385.275909 ] worker_thread+0x378/0x950 [ 385.275909 ] kthread+0x1bb/0x200 [ 385.275909 ] ret_from_fork+0x1f/0x30 [ 385.275909 ] [ 385.275909 ] The buggy address belongs to the object at ffff888104b79300 [ 385.275909 ] which belongs to the cache kmalloc-128 of size 128 [ 385.275909 ] The buggy address is located 8 bytes inside of [ 385.275909 ] 128-byte region [ffff888104b79300, ffff888104b79380) [ 385.275909 ] The buggy address belongs to the page: [ 385.275909 ] page:00000000de44dd39 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x104b78 [ 385.275909 ] head:00000000de44dd39 order:1 compound_mapcount:0 [ 385.275909 ] flags: 0x8000000000010200(slab|head|zone=2) [ 385.275909 ] raw: 8000000000010200 0000000000000000 dead000000000122 ffff8881000428c0 [ 385.275909 ] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 [ 385.275909 ] page dumped because: kasan: bad access detected [ 385.275909 ] [ 385.275909 ] Memory state around the buggy address: [ 385.275909 ] ffff888104b79200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc [ 385.275909 ] ffff888104b79280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 385.275909 ] >ffff888104b79300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 385.275909 ] ^ [ 385.275909 ] ffff888104b79380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 385.275909 ] ffff888104b79400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 385.275909 ]] Fixes: e890acd5ff18 ("net/mlx5: Add devlink flow_steering_mode parameter") Signed-off-by: Shay Drory 
<shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
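A hedged sketch of the probe-time split described above; the function names are illustrative of the init/cleanup vs. create/destroy separation and are not necessarily the upstream symbols:

    int  mlx5_fs_core_alloc(struct mlx5_core_dev *dev);   /* probe:  memory, caches, pools */
    int  mlx5_fs_core_init(struct mlx5_core_dev *dev);    /* load:   create namespaces */
    void mlx5_fs_core_cleanup(struct mlx5_core_dev *dev); /* unload: destroy namespaces */
    void mlx5_fs_core_free(struct mlx5_core_dev *dev);    /* remove: free memory, caches, pools */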
2022-05-03  net/mlx5: fs, an FTE should have no dests when deleted  (Mark Bloch, 1 file, -0/+1)
When deleting an FTE it should have no dests, which means fte->dests_size should be 0. Add a WARN_ON() to catch bugs where the proper tracking wasn't done. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
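The assertion described above boils down to something like this sketch (placement in the FTE delete path is assumed):

    /* Catch destination bookkeeping bugs: by the time an FTE is deleted,
     * all of its rules, and therefore all destinations, must be gone. */
    static void assert_fte_has_no_dests(struct fs_fte *fte)
    {
            WARN_ON(fte->dests_size);
    }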
2022-05-03  net/mlx5: fs, call the deletion function of the node  (Mark Bloch, 1 file, -1/+1)
Don't call del_hw_fte() directly, instead use the hardware deletion function set. This is just a small cleanup and doesn't change anything as for an FTE the deletion function is already set to del_hw_fte(). Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03  net/mlx5: fs, delete the FTE when there are no rules attached to it  (Mark Bloch, 1 file, -5/+5)
When an FTE has no children, it means all the rules were removed and the FTE can be deleted regardless of the dests_size value. While dests_size should be 0 when there are no children, be extra careful not to leak memory or get a firmware syndrome if the proper bookkeeping of dests_size wasn't done. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03  net/mlx5: fs, do proper bookkeeping for forward destinations  (Mark Bloch, 1 file, -1/+19)
Keep track of destinations that are forward destinations. When a forward destination is removed from an FTE, check whether the action bits need to be updated. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03  net/mlx5: fs, add unused destination type  (Mark Bloch, 1 file, -0/+2)
When the caller doesn't pass a destination, fs_core will create an unused rule just so a context can be returned. This unused rule is zeroed out and its type is 0, which can be mixed up with MLX5_FLOW_DESTINATION_TYPE_VPORT. Create a dedicated type, MLX5_FLOW_DESTINATION_TYPE_NONE, to differentiate between the two. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03  net/mlx5: fs, jump to exit point and don't fall through  (Mark Bloch, 1 file, -0/+1)
For code clarity and to prevent future bugs, make sure to jump to the exit point once done handling a specific type. This aligns the code with the rest of the logic in the function. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03  net/mlx5: fs, refactor software deletion rule  (Mark Bloch, 1 file, -6/+6)
When deleting a rule, make sure that for every type dests_size is decreased only once and no other logic is executed. Without this, dests_size might be decreased twice when dests_size == 1, as the if for that type won't be entered, and if the action has MLX5_FLOW_CONTEXT_ACTION_FWD_DEST set, dests_size will be decreased again. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-04-09  net/mlx5: Align flow steering allocation namespace to common style  (Leon Romanovsky, 1 file, -6/+0)
Flow steering is a low level internal driver API, as such it relies on the callers to check if namespace is supported and not rely on some compilation flag. Link: https://lore.kernel.org/r/cfb411a8a9ed2a1471810af254bdc0f03469f79c.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09  net/mlx5_fpga: Drop INNOVA IPsec support  (Leon Romanovsky, 1 file, -8/+1)
Mellanox INNOVA IPsec cards are EOL in Nov, 2019 [1]. As such, the code is unmaintained, untested and not in-use by any upstream/distro oriented customers. In order to reduce code complexity, drop the kernel code. [1] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf Link: https://lore.kernel.org/r/2afe88ec5020a491079eacf6fe3c89b64d65195c.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-03-01  Merge branch 'mlx5-next' of …  (Jakub Kicinski, 1 file, -1/+8)
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next 2022-22-02 The following PR includes updates to mlx5-next branch: Headlines: ========== 1) Jakub cleans up unused static inline functions 2) I did some low level firmware command interface return status changes to provide the caller with full visibility on the error/status returned by the Firmware. 3) Use the new command interface in RDMA DEVX usecases to avoid flooding dmesg with some "expected" user error prone use cases. 4) Moshe also uses the new command interface to grab the specific error code from MFRL register command to provide the exact error reason for why SW reset couldn't perform internally in FW. 5) From Mark Bloch: Lag, drop packets in hardware when possible In active-backup mode the inactive interface's packets are dropped by the bond device. In switchdev where TC rules are offloaded to the FDB this can lead to packets being hit in the FDB where without offload they would have been dropped before reaching TC rules in the kernel. Create a drop rule to make sure packets on inactive ports are dropped before reaching the FDB. Listen on NETDEV_CHANGEUPPER / NETDEV_CHANGEINFODATA events and record the inactive state and offload accordingly. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add clarification on sync reset failure net/mlx5: Add reset_state field to MFRL register RDMA/mlx5: Use new command interface API net/mlx5: cmdif, Refactor error handling and reporting of async commands net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct} net/mlx5: cmdif, Add new api for command execution net/mlx5: cmdif, cmd_check refactoring net/mlx5: cmdif, Return value improvements net/mlx5: Lag, offload active-backup drops to hardware net/mlx5: Lag, record inactive state of bond device net/mlx5: Lag, don't use magic numbers for ports net/mlx5: Lag, use local variable already defined to access E-Switch net/mlx5: E-switch, add drop rule support to ingress ACL net/mlx5: E-switch, remove special uplink ingress ACL handling net/mlx5: E-Switch, reserve and use same uplink metadata across ports net/mlx5: Add ability to insert to specific flow group mlx5: remove unused static inlines ==================== Link: https://lore.kernel.org/r/20220223233930.319301-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski, 1 file, -0/+2)
tools/testing/selftests/net/mptcp/mptcp_join.sh 34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers") 857898eb4b28 ("selftests: mptcp: add missing join check") 6ef84b1517e0 ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions") c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") 09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") 84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr") efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") 3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24  net/mlx5: Fix possible deadlock on rule deletion  (Maor Gottlieb, 1 file, -0/+2)
Add the missing call to up_write_ref_node(), which releases the semaphore in case the FTE doesn't have destinations, such as in the drop rule case. Fixes: 465e7baab6d9 ("net/mlx5: Fix deletion of duplicate rules") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24  net/mlx5: Add ability to insert to specific flow group  (Mark Bloch, 1 file, -1/+8)
If the flow table isn't an autogroup table, the upper driver has to create the flow groups explicitly. This information can't later be used when creating rules to insert into a specific flow group. Allow such a use case. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>