author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-14 02:47:48 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-14 02:47:48 +0300 |
commit | 7e68dd7d07a28faa2e6574dd6b9dbd90cdeaae91 (patch) | |
tree | ae0427c5a3b905f24b3a44b510a9bcf35d9b67a3 /drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c | |
parent | 1ca06f1c1acecbe02124f14a37cce347b8c1a90c (diff) | |
parent | 7c4a6309e27f411743817fe74a832ec2d2798a4b (diff) | |
download | linux-7e68dd7d07a28faa2e6574dd6b9dbd90cdeaae91.tar.xz |
Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Allow live renaming when an interface is up
- Add retpoline wrappers for tc, considerably improving the
performance of complex queue discipline configurations
- Add inet drop monitor support
- A few GRO performance improvements
- Add infrastructure for atomic dev stats, addressing long standing
data races
- De-duplicate common code between OVS and conntrack offloading
infrastructure
- A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements
- Netfilter: introduce packet parser for tunneled packets
- Replace IPVS timer-based estimators with kthreads to scale up the
workload with the number of available CPUs
- Add the helper support for connection-tracking OVS offload
BPF:
- Support for user defined BPF objects: the use case is to allocate
own objects, build own object hierarchies and use the building
blocks to build own data structures flexibly, for example, linked
lists in BPF
- Make cgroup local storage available to non-cgroup attached BPF
programs
- Avoid unnecessary deadlock detection and failures wrt BPF task
storage helpers
- A relevant bunch of BPF verifier fixes and improvements
- Veristat tool improvements to support custom filtering, sorting,
and replay of results
- Add LLVM disassembler as default library for dumping JITed code
- Lots of new BPF documentation for various BPF maps
- Add bpf_rcu_read_{,un}lock() support for sleepable programs
- Add RCU grace period chaining to BPF to wait for the completion of
access from both sleepable and non-sleepable BPF programs
- Add support for storing struct task_struct objects as kptrs in maps
- Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
values
- Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions
Protocols:
- TCP: implement Protective Load Balancing across switch links
- TCP: allow dynamically disabling TCP-MD5 static key, reverting back
to fast[er]-path
- UDP: Introduce optional per-netns hash lookup table
- IPv6: simplify and cleanup sockets disposal
- Netlink: support different type policies for each generic netlink
operation
- MPTCP: add MSG_FASTOPEN and FastOpen listener side support
- MPTCP: add netlink notification support for listener socket events
- SCTP: add VRF support, allowing SCTP sockets to bind to VRF devices
- Add bridging MAC Authentication Bypass (MAB) support
- Extensions for Ethernet VPN bridging implementation to better
support multicast scenarios
- More work for Wi-Fi 7 support, comprising conversion of all the
existing drivers to internal TX queue usage
- IPSec: introduce a new offload type (packet offload) allowing
complete header processing and crypto offloading
- IPSec: extended ack support for more descriptive XFRM error
reporting
- RXRPC: increase SACK table size and move processing into a
per-local endpoint kernel thread, considerably reducing the
required locking
- IEEE 802154: synchronous send frame and extended filtering support,
initial support for scanning available 15.4 networks
- Tun: bump the link speed from 10Mbps to 10Gbps
- Tun/VirtioNet: implement UDP segmentation offload support
Driver API:
- PHY/SFP: improve power level switching between standard level 1 and
the higher power levels
- New API for netdev <-> devlink_port linkage
- PTP: convert existing drivers to new frequency adjustment
implementation
- DSA: add support for rx offloading
- Autoload DSA tagging driver when dynamically changing protocol
- Add new PCP and APPTRUST attributes to Data Center Bridging
- Add configuration support for 800Gbps link speed
- Add devlink port function attribute to enable/disable RoCE and
migratable
- Extend devlink-rate to support strict priority and weighted fair
queuing
- Add devlink support for reading directly from region memory
- New device tree helper to fetch MAC address from nvmem
- New big TCP helper to simplify temporary header stripping
New hardware / drivers:
- Ethernet:
- Marvell Octeon CNF95N and CN10KB Ethernet Switches
- Marvell Prestera AC5X Ethernet Switch
- WangXun 10 Gigabit NIC
- Motorcomm yt8521 Gigabit Ethernet
- Microchip ksz9563 Gigabit Ethernet Switch
- Microsoft Azure Network Adapter
- Linux Automation 10Base-T1L adapter
- PHY:
- Aquantia AQR112 and AQR412
- Motorcomm YT8531S
- PTP:
- Orolia ART-CARD
- WiFi:
- MediaTek Wi-Fi 7 (802.11be) devices
- RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
devices
- Bluetooth:
- Broadcom BCM4377/4378/4387 Bluetooth chipsets
- Realtek RTL8852BE and RTL8723DS
- Cypress CYW4373A0 WiFi + Bluetooth combo device
Drivers:
- CAN:
- gs_usb: bus error reporting support
- kvaser_usb: listen only and bus error reporting support
- Ethernet NICs:
- Intel (100G):
- extend action skbedit to RX queue mapping
- implement devlink-rate support
- support direct read from memory
- nVidia/Mellanox (mlx5):
- SW steering improvements, increasing rules update rate
- Support for enhanced events compression
- extend H/W offload packet manipulation capabilities
- implement IPSec packet offload mode
- nVidia/Mellanox (mlx4):
- better big TCP support
- Netronome Ethernet NICs (nfp):
- IPsec offload support
- add support for multicast filter
- Broadcom:
- RSS and PTP support improvements
- AMD/SolarFlare:
- netlink extended ack improvements
- add basic flower matches to offload, and related stats
- Virtual NICs:
- ibmvnic: introduce affinity hint support
- small / embedded:
- FreeScale fec: add initial XDP support
- Marvell mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
- TI am65-cpsw: add suspend/resume support
- Mediatek MT7986: add RX Wireless Ethernet Dispatch support
- Realtek 8169: enable GRO software interrupt coalescing by
default
- Ethernet high-speed switches:
- Microchip (sparx5):
- add support for Sparx5 TC/flower H/W offload via VCAP
- Mellanox mlxsw:
- add 802.1X and MAC Authentication Bypass offload support
- add ip6gre support
- Embedded Ethernet switches:
- Mediatek (mtk_eth_soc):
- improve PCS implementation, add DSA untag support
- enable flow offload support
- Renesas:
- add rswitch R-Car Gen4 gPTP support
- Microchip (lan966x):
- add full XDP support
- add TC H/W offload via VCAP
- enable PTP on bridge interfaces
- Microchip (ksz8):
- add MTU support for KSZ8 series
- Qualcomm 802.11ax WiFi (ath11k):
- support configuring channel dwell time during scan
- MediaTek WiFi (mt76):
- enable Wireless Ethernet Dispatch (WED) offload support
- add ack signal support
- enable coredump support
- remain_on_channel support
- Intel WiFi (iwlwifi):
- enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
- 320 MHz channels support
- RealTek WiFi (rtw89):
- new dynamic header firmware format support
- wake-over-WLAN support"
* tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
ipvs: fix type warning in do_div() on 32 bit
net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
net: ipa: add IPA v4.7 support
dt-bindings: net: qcom,ipa: Add SM6350 compatible
bnxt: Use generic HBH removal helper in tx path
IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
selftests: forwarding: Add bridge MDB test
selftests: forwarding: Rename bridge_mdb test
bridge: mcast: Support replacement of MDB port group entries
bridge: mcast: Allow user space to specify MDB entry routing protocol
bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
bridge: mcast: Add support for (*, G) with a source list and filter mode
bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
bridge: mcast: Add a flag for user installed source entries
bridge: mcast: Expose __br_multicast_del_group_src()
bridge: mcast: Expose br_multicast_new_group_src()
bridge: mcast: Add a centralized error path
bridge: mcast: Place netlink policy before validation functions
bridge: mcast: Split (*, G) and (S, G) addition into different functions
bridge: mcast: Do not derive entry type from its filter mode
...
Diffstat (limited to 'drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c')
-rw-r--r-- | drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c | 119 |
1 files changed, 77 insertions, 42 deletions
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
index 91ff19f67695..74cbe53ee9db 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
@@ -3,13 +3,16 @@
 
 #include "dr_types.h"
 
-#define DR_RULE_MAX_STE_CHAIN (DR_RULE_MAX_STES + DR_ACTION_MAX_STES)
+#define DR_RULE_MAX_STES_OPTIMIZED 5
+#define DR_RULE_MAX_STE_CHAIN_OPTIMIZED (DR_RULE_MAX_STES_OPTIMIZED + DR_ACTION_MAX_STES)
 
-static int dr_rule_append_to_miss_list(struct mlx5dr_ste_ctx *ste_ctx,
+static int dr_rule_append_to_miss_list(struct mlx5dr_domain *dmn,
+				       enum mlx5dr_domain_nic_type nic_type,
 				       struct mlx5dr_ste *new_last_ste,
 				       struct list_head *miss_list,
 				       struct list_head *send_list)
 {
+	struct mlx5dr_ste_ctx *ste_ctx = dmn->ste_ctx;
 	struct mlx5dr_ste_send_info *ste_info_last;
 	struct mlx5dr_ste *last_ste;
 
@@ -17,7 +20,7 @@ static int dr_rule_append_to_miss_list(struct mlx5dr_ste_ctx *ste_ctx,
 	last_ste = list_last_entry(miss_list, struct mlx5dr_ste, miss_list_node);
 	WARN_ON(!last_ste);
 
-	ste_info_last = kzalloc(sizeof(*ste_info_last), GFP_KERNEL);
+	ste_info_last = mlx5dr_send_info_alloc(dmn, nic_type);
 	if (!ste_info_last)
 		return -ENOMEM;
 
@@ -32,16 +35,28 @@ static int dr_rule_append_to_miss_list(struct mlx5dr_ste_ctx *ste_ctx,
 	return 0;
 }
 
+static void dr_rule_set_last_ste_miss_addr(struct mlx5dr_matcher *matcher,
+					   struct mlx5dr_matcher_rx_tx *nic_matcher,
+					   u8 *hw_ste)
+{
+	struct mlx5dr_ste_ctx *ste_ctx = matcher->tbl->dmn->ste_ctx;
+	u64 icm_addr;
+
+	if (mlx5dr_ste_is_miss_addr_set(ste_ctx, hw_ste))
+		return;
+
+	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
+	mlx5dr_ste_set_miss_addr(ste_ctx, hw_ste, icm_addr);
+}
+
 static struct mlx5dr_ste *
 dr_rule_create_collision_htbl(struct mlx5dr_matcher *matcher,
 			      struct mlx5dr_matcher_rx_tx *nic_matcher,
 			      u8 *hw_ste)
 {
 	struct mlx5dr_domain *dmn = matcher->tbl->dmn;
-	struct mlx5dr_ste_ctx *ste_ctx = dmn->ste_ctx;
 	struct mlx5dr_ste_htbl *new_htbl;
 	struct mlx5dr_ste *ste;
-	u64 icm_addr;
 
 	/* Create new table for miss entry */
 	new_htbl = mlx5dr_ste_htbl_alloc(dmn->ste_icm_pool,
@@ -55,8 +70,7 @@ dr_rule_create_collision_htbl(struct mlx5dr_matcher *matcher,
 
 	/* One and only entry, never grows */
 	ste = new_htbl->chunk->ste_arr;
-	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
-	mlx5dr_ste_set_miss_addr(ste_ctx, hw_ste, icm_addr);
+	dr_rule_set_last_ste_miss_addr(matcher, nic_matcher, hw_ste);
 	mlx5dr_htbl_get(new_htbl);
 
 	return ste;
@@ -120,7 +134,7 @@ dr_rule_handle_one_ste_in_update_list(struct mlx5dr_ste_send_info *ste_info,
 		goto out;
 
 out:
-	kfree(ste_info);
+	mlx5dr_send_info_free(ste_info);
 	return ret;
 }
 
@@ -191,8 +205,8 @@ dr_rule_rehash_handle_collision(struct mlx5dr_matcher *matcher,
 	new_ste->htbl->chunk->miss_list = mlx5dr_ste_get_miss_list(col_ste);
 
 	/* Update the previous from the list */
-	ret = dr_rule_append_to_miss_list(dmn->ste_ctx, new_ste,
-					  mlx5dr_ste_get_miss_list(col_ste),
+	ret = dr_rule_append_to_miss_list(dmn, nic_matcher->nic_tbl->nic_dmn->type,
+					  new_ste, mlx5dr_ste_get_miss_list(col_ste),
 					  update_list);
 	if (ret) {
 		mlx5dr_dbg(dmn, "Failed update dup entry\n");
@@ -238,7 +252,6 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 	bool use_update_list = false;
 	u8 hw_ste[DR_STE_SIZE] = {};
 	struct mlx5dr_ste *new_ste;
-	u64 icm_addr;
 	int new_idx;
 	u8 sb_idx;
 
@@ -247,9 +260,8 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 	mlx5dr_ste_set_bit_mask(hw_ste, nic_matcher->ste_builder[sb_idx].bit_mask);
 
 	/* Copy STE control and tag */
-	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
 	memcpy(hw_ste, mlx5dr_ste_get_hw_ste(cur_ste), DR_STE_SIZE_REDUCED);
-	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste, icm_addr);
+	dr_rule_set_last_ste_miss_addr(matcher, nic_matcher, hw_ste);
 
 	new_idx = mlx5dr_ste_calc_hash_index(hw_ste, new_htbl);
 	new_ste = &new_htbl->chunk->ste_arr[new_idx];
@@ -278,7 +290,8 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 	new_htbl->ctrl.num_of_valid_entries++;
 
 	if (use_update_list) {
-		ste_info = kzalloc(sizeof(*ste_info), GFP_KERNEL);
+		ste_info = mlx5dr_send_info_alloc(dmn,
+						  nic_matcher->nic_tbl->nic_dmn->type);
 		if (!ste_info)
 			goto err_exit;
 
@@ -357,6 +370,15 @@ static int dr_rule_rehash_copy_htbl(struct mlx5dr_matcher *matcher,
 						update_list);
 		if (err)
 			goto clean_copy;
+
+		/* In order to decrease the number of allocated ste_send_info
+		 * structs, send the current table row now.
+		 */
+		err = dr_rule_send_update_list(update_list, matcher->tbl->dmn, false);
+		if (err) {
+			mlx5dr_dbg(matcher->tbl->dmn, "Failed updating table to HW\n");
+			goto clean_copy;
+		}
 	}
 
 clean_copy:
@@ -387,7 +409,8 @@ dr_rule_rehash_htbl(struct mlx5dr_rule *rule,
 	nic_matcher = nic_rule->nic_matcher;
 	nic_dmn = nic_matcher->nic_tbl->nic_dmn;
 
-	ste_info = kzalloc(sizeof(*ste_info), GFP_KERNEL);
+	ste_info = mlx5dr_send_info_alloc(dmn,
+					  nic_matcher->nic_tbl->nic_dmn->type);
 	if (!ste_info)
 		return NULL;
 
@@ -473,13 +496,13 @@ free_ste_list:
 	list_for_each_entry_safe(del_ste_info, tmp_ste_info,
 				 &rehash_table_send_list, send_list) {
 		list_del(&del_ste_info->send_list);
-		kfree(del_ste_info);
+		mlx5dr_send_info_free(del_ste_info);
 	}
 
 free_new_htbl:
 	mlx5dr_ste_htbl_free(new_htbl);
 free_ste_info:
-	kfree(ste_info);
+	mlx5dr_send_info_free(ste_info);
 	mlx5dr_info(dmn, "Failed creating rehash table\n");
 	return NULL;
 }
@@ -512,11 +535,11 @@ dr_rule_handle_collision(struct mlx5dr_matcher *matcher,
 			 struct list_head *send_list)
 {
 	struct mlx5dr_domain *dmn = matcher->tbl->dmn;
-	struct mlx5dr_ste_ctx *ste_ctx = dmn->ste_ctx;
 	struct mlx5dr_ste_send_info *ste_info;
 	struct mlx5dr_ste *new_ste;
 
-	ste_info = kzalloc(sizeof(*ste_info), GFP_KERNEL);
+	ste_info = mlx5dr_send_info_alloc(dmn,
+					  nic_matcher->nic_tbl->nic_dmn->type);
 	if (!ste_info)
 		return NULL;
 
@@ -524,8 +547,8 @@ dr_rule_handle_collision(struct mlx5dr_matcher *matcher,
 	if (!new_ste)
 		goto free_send_info;
 
-	if (dr_rule_append_to_miss_list(ste_ctx, new_ste,
-					miss_list, send_list)) {
+	if (dr_rule_append_to_miss_list(dmn, nic_matcher->nic_tbl->nic_dmn->type,
+					new_ste, miss_list, send_list)) {
 		mlx5dr_dbg(dmn, "Failed to update prev miss_list\n");
 		goto err_exit;
 	}
@@ -541,7 +564,7 @@ dr_rule_handle_collision(struct mlx5dr_matcher *matcher,
 err_exit:
 	mlx5dr_ste_free(new_ste, matcher, nic_matcher);
 free_send_info:
-	kfree(ste_info);
+	mlx5dr_send_info_free(ste_info);
 	return NULL;
 }
 
@@ -721,8 +744,8 @@ static int dr_rule_handle_action_stes(struct mlx5dr_rule *rule,
 		list_add_tail(&action_ste->miss_list_node,
 			      mlx5dr_ste_get_miss_list(action_ste));
 
-		ste_info_arr[k] = kzalloc(sizeof(*ste_info_arr[k]),
-					  GFP_KERNEL);
+		ste_info_arr[k] = mlx5dr_send_info_alloc(dmn,
+							 nic_matcher->nic_tbl->nic_dmn->type);
 		if (!ste_info_arr[k])
 			goto err_exit;
 
@@ -759,7 +782,6 @@ static int dr_rule_handle_empty_entry(struct mlx5dr_matcher *matcher,
 {
 	struct mlx5dr_domain *dmn = matcher->tbl->dmn;
 	struct mlx5dr_ste_send_info *ste_info;
-	u64 icm_addr;
 
 	/* Take ref on table, only on first time this ste is used */
 	mlx5dr_htbl_get(cur_htbl);
@@ -767,12 +789,12 @@ static int dr_rule_handle_empty_entry(struct mlx5dr_matcher *matcher,
 	/* new entry -> new branch */
 	list_add_tail(&ste->miss_list_node, miss_list);
 
-	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
-	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste, icm_addr);
+	dr_rule_set_last_ste_miss_addr(matcher, nic_matcher, hw_ste);
 
 	ste->ste_chain_location = ste_location;
 
-	ste_info = kzalloc(sizeof(*ste_info), GFP_KERNEL);
+	ste_info = mlx5dr_send_info_alloc(dmn,
+					  nic_matcher->nic_tbl->nic_dmn->type);
 	if (!ste_info)
 		goto clean_ste_setting;
 
@@ -793,7 +815,7 @@ static int dr_rule_handle_empty_entry(struct mlx5dr_matcher *matcher,
 	return 0;
 
 clean_ste_info:
-	kfree(ste_info);
+	mlx5dr_send_info_free(ste_info);
 clean_ste_setting:
 	list_del_init(&ste->miss_list_node);
 	mlx5dr_htbl_put(cur_htbl);
@@ -1089,6 +1111,7 @@ dr_rule_create_rule_nic(struct mlx5dr_rule *rule,
 			size_t num_actions,
 			struct mlx5dr_action *actions[])
 {
+	u8 hw_ste_arr_optimized[DR_RULE_MAX_STE_CHAIN_OPTIMIZED * DR_STE_SIZE] = {};
 	struct mlx5dr_ste_send_info *ste_info, *tmp_ste_info;
 	struct mlx5dr_matcher *matcher = rule->matcher;
 	struct mlx5dr_domain *dmn = matcher->tbl->dmn;
@@ -1098,6 +1121,7 @@ dr_rule_create_rule_nic(struct mlx5dr_rule *rule,
 	struct mlx5dr_ste_htbl *cur_htbl;
 	struct mlx5dr_ste *ste = NULL;
 	LIST_HEAD(send_ste_list);
+	bool hw_ste_arr_is_opt;
 	u8 *hw_ste_arr = NULL;
 	u32 new_hw_ste_arr_sz;
 	int ret, i;
@@ -1109,9 +1133,23 @@ dr_rule_create_rule_nic(struct mlx5dr_rule *rule,
 					 rule->flow_source))
 		return 0;
 
-	hw_ste_arr = kzalloc(DR_RULE_MAX_STE_CHAIN * DR_STE_SIZE, GFP_KERNEL);
-	if (!hw_ste_arr)
-		return -ENOMEM;
+	ret = mlx5dr_matcher_select_builders(matcher,
+					     nic_matcher,
+					     dr_rule_get_ipv(&param->outer),
+					     dr_rule_get_ipv(&param->inner));
+	if (ret)
+		return ret;
+
+	hw_ste_arr_is_opt = nic_matcher->num_of_builders <= DR_RULE_MAX_STES_OPTIMIZED;
+	if (likely(hw_ste_arr_is_opt)) {
+		hw_ste_arr = hw_ste_arr_optimized;
+	} else {
+		hw_ste_arr = kzalloc((nic_matcher->num_of_builders + DR_ACTION_MAX_STES) *
+				     DR_STE_SIZE, GFP_KERNEL);
+
+		if (!hw_ste_arr)
+			return -ENOMEM;
+	}
 
 	mlx5dr_domain_nic_lock(nic_dmn);
 
@@ -1119,13 +1157,6 @@ dr_rule_create_rule_nic(struct mlx5dr_rule *rule,
 	if (ret)
 		goto free_hw_ste;
 
-	ret = mlx5dr_matcher_select_builders(matcher,
-					     nic_matcher,
-					     dr_rule_get_ipv(&param->outer),
-					     dr_rule_get_ipv(&param->inner));
-	if (ret)
-		goto remove_from_nic_tbl;
-
 	/* Set the tag values inside the ste array */
 	ret = mlx5dr_ste_build_ste_arr(matcher, nic_matcher, param, hw_ste_arr);
 	if (ret)
@@ -1187,7 +1218,8 @@ dr_rule_create_rule_nic(struct mlx5dr_rule *rule,
 
 	mlx5dr_domain_nic_unlock(nic_dmn);
 
-	kfree(hw_ste_arr);
+	if (unlikely(!hw_ste_arr_is_opt))
+		kfree(hw_ste_arr);
 
 	return 0;
 
@@ -1196,7 +1228,7 @@ free_rule:
 	/* Clean all ste_info's */
 	list_for_each_entry_safe(ste_info, tmp_ste_info, &send_ste_list, send_list) {
 		list_del(&ste_info->send_list);
-		kfree(ste_info);
+		mlx5dr_send_info_free(ste_info);
 	}
 
 remove_from_nic_tbl:
@@ -1205,7 +1237,10 @@ remove_from_nic_tbl:
 
 free_hw_ste:
 	mlx5dr_domain_nic_unlock(nic_dmn);
-	kfree(hw_ste_arr);
+
+	if (unlikely(!hw_ste_arr_is_opt))
+		kfree(hw_ste_arr);
+
 	return ret;
 }
 
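The last hunks of the diff above swap an unconditional kzalloc() of the STE array for an on-stack buffer that covers the common case, with a heap fallback for larger chains. The following is a minimal userspace sketch of that general pattern, not the driver's code: `ENTRY_SIZE`, `MAX_ENTRIES_OPTIMIZED`, and `build_chain()` are hypothetical names chosen for illustration, and `calloc()`/`free()` stand in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes for illustration only; not the driver's values. */
#define ENTRY_SIZE 64
#define MAX_ENTRIES_OPTIMIZED 5

/*
 * Tags the first byte of each of n_entries fixed-size entries and returns
 * the sum of those tags. Small requests use a zeroed stack buffer; larger
 * ones fall back to a zeroed heap allocation. Returns -1 if the heap
 * allocation fails.
 */
static long build_chain(size_t n_entries)
{
	uint8_t stack_buf[MAX_ENTRIES_OPTIMIZED * ENTRY_SIZE] = {0};
	int is_opt = n_entries <= MAX_ENTRIES_OPTIMIZED;
	uint8_t *buf;
	long sum = 0;
	size_t i;

	if (is_opt) {
		buf = stack_buf;	/* common case: no allocation at all */
	} else {
		buf = calloc(n_entries, ENTRY_SIZE);
		if (!buf)
			return -1;
	}

	for (i = 0; i < n_entries; i++)
		buf[i * ENTRY_SIZE] = (uint8_t)(i + 1);
	for (i = 0; i < n_entries; i++)
		sum += buf[i * ENTRY_SIZE];

	/* Only the fallback path owns heap memory, so only it may free. */
	if (!is_opt)
		free(buf);

	return sum;
}
```

Calling `build_chain(3)` stays on the stack path, while `build_chain(8)` takes the heap path. As in the diff, the flag that records which path was taken must be consulted on every exit path before freeing, since passing a stack address to the allocator's free routine would be a bug.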