summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
AgeCommit message (Collapse)AuthorFilesLines
2024-03-12mlxsw: spectrum: Allow fetch-and-clear of flow countersPetr Machata1-2/+2
For the report_delta-like interface like a previous patch has added for collection of NH group statistics, it's easiest to read the counter and have the HW clear it right away. Thus, change mlxsw_sp_flow_counter_get() to take a bool indicating whether this should be done. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/6a096ede8ee92d5041e3832242c3bbc137198aba.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-30mlxsw: spectrum: Query max_lag onceAmit Cohen1-0/+1
The maximum number of LAGs is queried from core several times. It is used to allocate LAG array, and then to iterate over it. In addition, it is used for PGT initialization. To simplify the code, instead of querying it several times, store the value as part of 'mlxsw_sp' and use it. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-30mlxsw: spectrum: Change mlxsw_sp_upper to LAG structureAmit Cohen1-12/+2
The structure mlxsw_sp_upper is used only as LAG. Rename it to mlxsw_sp_lag and move it to spectrum.c file, as it is used only there. Move the function mlxsw_sp_lag_get() with the structure. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-15mlxsw: spectrum_fid: Add an "any" packet typePetr Machata1-0/+2
Flood profiles have been used prior to CFF support for NVE underlay. Like is the case with FID flooding, an NVE profile describes at which offset a datum is located given traffic type. mlxsw currently only ever uses one KVD entry for NVE lookup, i.e. regardless of traffic type, the offset is always zero. To be able to describe this, add a traffic type enumerator describing "any traffic type". Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-30mlxsw: spectrum_fid: Add hooks for RSP table maintenancePetr Machata1-0/+2
In the CFF flood mode, the driver has to allocate a table within PGT, which holds flood vectors for router subport FIDs. For LAGs, these flood vectors have to obviously be maintained dynamically as port membership in a LAG changes. But even for physical ports, the flood vectors have to be kept valid, and may not contain enabled bits corresponding to non-existent ports. It is therefore not possible to precompute the port part of the RSP table, it has to be maintained as ports come and go due to splits. To support the RSP table maintenance, add to FID ops two new ops: fid_port_init and fid_port_fini, for when a port comes to existence, or joins a lag, and vice versa. Invoke these ops from mlxsw_sp_port_fids_init() and mlxsw_sp_port_fids_fini(), which are called when port is added and removed, respectively. Also add two new hooks for LAG maintenance, mlxsw_sp_fid_port_join_lag() / _leave_lag() which transitively call into the same ops. Later patches will actually add the op implementations themselves, this just adds the scaffolding. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/234398a23540317abb25f74f920a5c8121faecf0.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30mlxsw: spectrum_fid: Add a not-UC packet typePetr Machata1-0/+2
In CFF flood mode, the rFID family will allocate two tables. One for unknown UC traffic, one for everything else. Add a traffic type for the everything else traffic. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/8fb968b2d1cc37137cd0110c98cdeb625b03ca99.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30mlxsw: spectrum_fid: Privatize FID familiesPetr Machata1-5/+8
Currently, mlxsw always uses a "controlled" flood mode on all Nvidia Spectrum generations. The following patches will however introduce a possibility to run a "CFF" (for Compressed FID Flooding) mode on newer machines, if the FW supports it. Several operations will differ between how they need to be done in controlled mode vs. CFF mode. Thus the per-FID-family ops will differ between controlled and CFF, thus the FID family array as such will differ depending on whether the mode negotiated with FW is controlled or CFF. The simple approach of having several globally visible arrays for spectrum.c to statically choose from no longer works. Instead privatize all FID initialization and finalization logic, and expose it as ops instead. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/d3fa390d97cf3dbd2f7a28741be69b311e2059e4.1701183891.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-22mlxsw: spectrum_router: Add a helper to get subport number from a RIFPetr Machata1-0/+2
In the CFF flood mode, responsibility for management of the PGT entries for rFIDs is moved from FW to the driver. All rFIDs are based off either a front panel port, or a LAG port. The flood vectors for port-based rFIDs enable just the port itself, the ones for LAG-based rFIDs enable all member ports of the LAG in question. Since all rFIDs based off the same port have the same flood vector, and similarly for LAG-based rFIDs, the flood entries are shared. The PGT address of the flood vector is therefore determined based on the port (or LAG) number of the RIF connected with the rFID. Add a helper to determine subport number given a RIF, to be used in these calculations. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/d7ab43cf5b021f785f363f236e4b6780d10eea93.1700503644.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-20mlxsw: spectrum: Allocate LAG table when in SW LAG modePetr Machata1-0/+1
In this patch, if the LAG mode is SW, allocate the LAG table and configure SGCR to indicate where it was allocated. We use the default "DDD" (for dynamic data duplication) layout of the LAG table. In the DDD mode, the membership information for each LAG is copied in 8 PGT entries. This is done for performance reasons. The LAG table then needs to be allocated on an address aligned to 8. Deal with this by moving the LAG init ahead so that the LAG table is allocated at address 0. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20mlxsw: spectrum_pgt: Generalize PGT allocationPetr Machata1-1/+1
PGT blocks are allocated through the function mlxsw_sp_pgt_mid_alloc_range(). The interface assumes that the caller knows which piece of PGT exactly they want to get. That was fine while the FID code was the only client allocating blocks of PGT. However for SW-allocated LAG table, there will be an additional client: mlxsw_sp_lag_init(). The interface should therefore be changed to not require particular coordinates, but to take just the requested size, allocate the block wherever, and give back the PGT address. In this patch, change the interface accordingly. Initialize FID family's pgt_base from the result of the PGT allocation (note that mlxsw makes a copy of the family structure, so what gets initialized is not actually the global structure). Drop the now-unnecessary pgt_base initializations and the corresponding defines. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14mlxsw: spectrum: Stop ignoring learning notifications from redirected trafficIdo Schimmel1-1/+0
As explained in the previous patch, with the ignore action prepended to the redirect action, it is not longer possible for redirected traffic to generate learning notifications. Therefore, remove the workaround that was added in commit 577fa14d2100 ("mlxsw: spectrum: Do not process learned records with a dummy FID") as it is no longer needed. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14mlxsw: spectrum_flower: Disable learning and security lookup when redirectingIdo Schimmel1-0/+3
It is possible to add a filter that redirects traffic from the ingress of a bridge port that is locked (i.e., performs security / SMAC lookup) and has learning enabled. For example: # ip link add name br0 type bridge # ip link set dev swp1 master br0 # bridge link set dev swp1 learning on locked on mab on # tc qdisc add dev swp1 clsact # tc filter add dev swp1 ingress pref 1 proto ip flower skip_sw src_ip 192.0.2.1 action mirred egress redirect dev swp2 In the kernel's Rx path, this filter is evaluated before the Rx handler of the bridge, which means that redirected traffic should not be affected by bridge port configuration such as learning. However, the hardware data path is a bit different and the redirect action (FORWARDING_ACTION in hardware) merely attaches a pointer to the packet, which is later used by the L2 lookup stage to understand how to forward the packet. Between both stages - ingress ACL and L2 lookup - learning and security lookup are performed, which means that redirected traffic is affected by bridge port configuration, unlike in the kernel's data path. The learning discrepancy was handled in commit 577fa14d2100 ("mlxsw: spectrum: Do not process learned records with a dummy FID") by simply ignoring learning notifications generated by the redirected traffic. A similar solution is not possible for the security / SMAC lookup since - unlike learning - the CPU is not involved and packets that failed the lookup are dropped by the device. Instead, solve this by prepending the ignore action to the redirect action and use it to instruct the device to disable both learning and the security / SMAC lookup for redirected traffic. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-05mlxsw: spectrum: Remove unused function declarationsYue Haibing1-4/+0
Commit c3d2ed93b14d ("mlxsw: Remove old parsing depth infrastructure") left behind mlxsw_sp_nve_inc_parsing_depth_get()/mlxsw_sp_nve_inc_parsing_depth_put(). And commit 532b49e41e64 ("mlxsw: spectrum_span: Derive SBIB from maximum port speed & MTU") remove mlxsw_sp_span_port_mtu_update()/mlxsw_sp_span_speed_update_work() but leave the declarations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20230803142047.42660-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-28mlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put()Petr Machata1-2/+0
As of commit 151b89f6025a ("mlxsw: spectrum_router: Reuse work neighbor initialization in work scheduler"), the functions mlxsw_sp_port_lower_dev_hold() and mlxsw_sp_port_dev_put() have no users. Drop them. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/d0adcd7cb4ea19416294a0f861100edba84c9f36.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-21mlxsw: spectrum_switchdev: Replay switchdev objects on port joinPetr Machata1-0/+2
Currently it never happens that a netdevice that is already a bridge slave would suddenly become mlxsw upper. The only case where this might be possible as far as mlxsw is concerned, is with LAG netdevices. But if a LAG has any upper (e.g. is enslaved), enlaving mlxsw port to that LAG is forbidden. Thus the only way to install a LAG between a bridge and a mlxsw port is by first enslaving the port to the LAG, and then enslaving that LAG to a bridge. At that point there are no bridge objects (such as port VLANs) to replay. Those are added afterwards, and notified as they are created. This holds even for the PVID. However in the following patches, the requirement that ports be only enslaved to masters without uppers, is going to be relaxed. It will therefore be necessary to replay the existing bridge objects. Without this replay, e.g. the mlxsw bridge_port_vlan objects are not instantiated, which causes issues later, as a lot of code relies on their presence. To that end, add a new notifier block whose sole role is to filter out events related to the one relevant upper, and forward those to the existing switchdev notifier block. Pass the new notifier block to switchdev_bridge_port_offload() when the bridge port is created. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-13mlxsw: spectrum_flower: Add ability to match on port rangesIdo Schimmel1-1/+5
Add the ability to match on port ranges by utilizing the previously added port range registers and the port range key element. Up to two port range registers can be used for each filter, one for source port and another for destination port. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/df4385a9592917e9a22ebff339e0463e4a8dfa82.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-13mlxsw: spectrum_acl: Pass main driver structure to mlxsw_sp_acl_rulei_destroy()Ido Schimmel1-1/+2
The main driver structure will be needed in this function by a subsequent patch, so pass it. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/24d96a4e21310e5de2951ace58263db35e44a0df.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-13mlxsw: spectrum_port_range: Add devlink resource supportIdo Schimmel1-0/+1
Expose via devlink-resource the maximum number of port range registers and their current occupancy. Besides the observability benefits, this resource will be used by subsequent patches for scale and occupancy tests. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/7945e0c715dc5efb1617f45f7560c1f1bd0bcf8a.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-13mlxsw: spectrum_port_range: Add port range coreIdo Schimmel1-0/+15
The Spectrum ASICs have a fixed number of port range registers, each of which maintains the following parameters: * Minimum and maximum port. * Apply port range for source port, destination port or both. * Apply port range for TCP, UDP or both. * Apply port range for IPv4, IPv6 or both. Implement a port range core which takes care of the allocation and configuration of these registers and exposes an API that allows in-driver consumers (e.g., the ACL code) to request matching on a range of either source or destination port. These registers are going to be used for port range matching in the flower classifier that already matches on EtherType being IPv4 / IPv6 and IP protocol being TCP / UDP. As such, there is no need to limit these registers to a specific EtherType or IP protocol, which will increase the likelihood of a register being shared by multiple flower filters. It is unlikely that a filter will match on the same range of both source and destination ports, which is why each register is only configured to match on either source or destination port. If a filter requires matching on a range of both source and destination ports, it will utilize two port range registers and match on the output of both. For efficient lookup and traversal, use XArray to store the allocated port range registers. The XArray uses RCU and an internal spinlock to synchronise access, so there is no need for a dedicate lock. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/674f00539a0072d455847663b5feb504db51a259.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14mlxsw: spectrum_router: Add a helper specifically for joining a LAGPetr Machata1-4/+0
Currently, joining a LAG very simply means that the LAG RIF should be joined by the subport representing untagged traffic. If the RIF does not exist, it does not have to be created: if the user wants there to be RIF for the LAG device, they are supposed to add an IP address, and they are supposed to do it after tha LAG becomes mlxsw upper. We can also assume that the LAG has no uppers, otherwise the enslavement is not allowed. In the future, these ordering dependencies should be removed. That means that joining LAG will be more complex operation, possibly involving a lazy RIF creation, and possibly joining / lazily creating RIFs for VLAN uppers of the LAG. It will be handy to have a dedicated function that handles all this. The new function mlxsw_sp_router_port_join_lag() is that. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-12mlxsw: spectrum_router: Move here inetaddr validator notifiersPetr Machata1-4/+0
The validation logic is already in the router code. Move there the notifier blocks themselves as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08mlxsw: spectrum_acl_tcam: Move devlink param to TCAM codeIdo Schimmel1-2/+1
Cited commit added 'DEVLINK_CMD_PARAM_DEL' notifications whenever the network namespace of the devlink instance is changed. Specifically, the notifications are generated after calling reload_down(), but before calling reload_up(). At this stage, the data structures accessed while reading the value of the "acl_region_rehash_interval" devlink parameter are uninitialized, resulting in a use-after-free [1]. Fix by moving the registration and unregistration of the devlink parameter to the TCAM code where it is actually used. This means that the parameter is unregistered during reload_down() and then re-registered during reload_up(), avoiding the use-after-free between these two operations. Reproducer: # ip netns add test123 # devlink dev reload pci/0000:06:00.0 netns test123 [1] BUG: KASAN: use-after-free in mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0 Read of size 4 at addr ffff888162fd37d8 by task devlink/1323 [...] Call Trace: <TASK> dump_stack_lvl+0x95/0xbd print_report+0x181/0x4a1 kasan_report+0xdb/0x200 mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0 mlxsw_sp_params_acl_region_rehash_intrvl_get+0x32/0x80 devlink_nl_param_fill.constprop.0+0x29a/0x11e0 devlink_param_notify.constprop.0+0xb9/0x250 devlink_notify_unregister+0xbc/0x470 devlink_reload+0x1aa/0x440 devlink_nl_cmd_reload+0x559/0x11b0 genl_family_rcv_msg_doit.isra.0+0x1f8/0x2e0 genl_rcv_msg+0x558/0x7f0 netlink_rcv_skb+0x170/0x440 genl_rcv+0x2d/0x40 netlink_unicast+0x53f/0x810 netlink_sendmsg+0x961/0xe80 __sys_sendto+0x2a4/0x420 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 7d7e9169a3ec ("devlink: move devlink reload notifications back in between _down() and _up() calls") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-10mlxsw: spectrum: Add an API to configure security checksIdo Schimmel1-1/+4
Add an API to enable or disable security checks on a local port. It will be used by subsequent patches when the 'BR_PORT_LOCKED' flag is toggled. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29mlxsw: Send PTP packets as data packets to overcome a limitationDanielle Ratson1-0/+10
In Spectrum-2 and Spectrum-3, the correction field of PTP packets which are sent as control packets is not updated at egress port. To overcome this limitation, PTP packets which require time stamp, should be sent as data packets with the following details: 1. FID valid = 1 2. FID value above the maximum FID 3. rx_router_port = 1 >From Spectrum-4 and on, this limitation will be solved. Extend the function which handles TX header, in case that the packet is a PTP packet, add TX header with type=data and all the above mentioned requirements. Add operation as part of 'struct mlxsw_sp_ptp_ops', to be able to separate the handling of PTP packets between different ASICs. Use the data packet solution only for Spectrum-2 and Spectrum-3. Therefore, add a dedicated operation structure for Spectrum-4, as it will be same to Spectrum-2 in PTP implementation, just will not have the limitation of control packets. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-04mlxsw: Enable unified bridge modelAmit Cohen1-7/+0
After all the preparations for unified bridge model, finally flip mlxsw driver to use the new model. Change config profile, set 'ubridge' to true and remove the configurations that are relevant only for the legacy model. Set 'flood_mode' to 'controlled' as the current mode is not supported with unified bridge model. Remove all the code which is dedicated to the legacy model. Remove 'struct mlxsw_sp.ubridge' variable which was temporarily added to separate configurations between the models. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-04mlxsw: Add new FID families for unified bridge modelAmit Cohen1-0/+4
In the unified bridge model, mlxsw will no longer emulate 802.1Q FIDs using 802.1D FIDs. The new FID table will look as follows: +---------------+ | 802.1q FIDs | 4K entries | [1..4094] | +---------------+ | 802.1d FIDs | 1K entries | [4095..5118] | +---------------+ | Dummy FIDs | 1 entry | [5119..5119] | +---------------+ | rFIDs | 11K entries | [5120..16383] | +---------------+ In order to make the change easier to review, four new temporary FID families will be added (e.g., MLXSW_SP_FID_TYPE_8021D_UB) and will not be registered with the FID core until mlxsw is flipped to use the unified bridge model. Add .1d, rfid and dummy FID families for unified bridge, the next patch will add .1q family separately as it requires more changes. The following changes are required: 1. Add 'smpe_index_valid' field to 'struct mlxsw_sp_fid_family' and set SFMR.smpe accordingly. SMPE index is reserved for rFIDs, as their flooding is handled by firmware, and always reserved in Spectrum-1, as it is configured as part of PGT table. 2. Add 'ubridge' field to 'struct mlxsw_sp_fid_family'. This field will be removed later, use it in mlxsw_sp_fid_family_{register,unregister}() to skip the registration / unregistration of the new families when the legacy model is used. 3. Indexes - the start and end indexes of each FID family will need to be changed according to the above diagram. 4. Add flood tables for unified bridge model, use 'fid_offset' as table type, as in the new model the access to flood tables will be using 'fid_offset' calculation. 5. FID family operation changes: a. rFID supposed to be created using SFMR, as it is not created by firmware using unified bridge model. b. port_vid_map() should perform SVFA for rFID, as the mapping is not created by firmware using unified bridge model. c. flood_index() is not aligned to the new model, as this function will be removed later. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-04mlxsw: Add support for VLAN RIFsAmit Cohen1-0/+1
Router interfaces (RIFs) constructed on top of VLAN-aware bridges are of 'VLAN' type, whereas RIFs constructed on top of VLAN-unaware bridges are of 'FID' type. Currently 802.1Q FIDs are emulated using 802.1D FIDs, therefore VLAN RIFs are emulated using FID RIFs. As part of converting the driver to use unified bridge model, 802.1Q FIDs and VLAN RIFs will be used. The egress FID is required for VLAN RIFs in Spectrum-2 and above, but not in Spectrum-1, as in Spectrum-1 the mapping for VLAN RIFs is VID->FID, while in other ASICs it is FID->FID. The reason for the change is that it is more scalable to reuse the FID->FID entry than creating multiple {Port, VID}->FID entries for the router port. Use the existing operation structure to separate the configuration between different ASICs. Add support for VLAN RIFs, most of the configurations are same to FID RIFs. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-04mlxsw: Configure ingress RIF classificationAmit Cohen1-1/+3
Before layer 2 forwarding, the device classifies an incoming packet to a FID. The classification is done based on one of the following keys: 1. FID 2. VNI (after decapsulation) 3. VID / {Port, VID} After classification, the FID is known, but also all the attributes of the FID, such as the router interface (RIF) via which a packet that needs to be routed will ingress the router block. In the legacy model, when a RIF was created / destroyed, it was firmware's responsibility to update it in the previously mentioned FID classification records. In the unified bridge model, this responsibility moved to software. The third classification requires to iterate over the FID's {Port, VID} list and issue SVFA write with the correct mapping table according to the port's mode (virtual or not). We never map multiple VLANs to the same FID using VID->FID mapping, so such a mapping needs to be performed once. When a new FID classification entry is configured and the FID already has a RIF, set the RIF as part of SVFA configuration. The reverse needs to be done when clearing a RIF from a FID. Currently, clearing is done by issuing mlxsw_sp_fid_rif_set() with a NULL RIF pointer. Instead, introduce mlxsw_sp_fid_rif_unset(). Note that mlxsw_sp_fid_rif_set() is called after the RIF is fully operational, so it conforms to the internal requirement regarding SVFA.irif_v: "Must not be set for a non-enabled RIF". Do not set the ingress RIF for rFIDs, as the {Port, VID}->rFID entry is configured by firmware when legacy model is used, a next patch will handle this configuration for rFIDs and unified bridge model. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29mlxsw: spectrum_switchdev: Rename MID structureAmit Cohen1-9/+0
Currently the structure which represents MDB entry is called 'struct mlxsw_sp_mid'. This name is not accurate as a MID entry stores a bitmap of ports to which a packet needs to be replicated and a MDB entry stores the mapping from {MAC, FID} to PGT index (MID). Rename the structure to 'struct mlxsw_sp_mdb_entry'. The structure 'mlxsw_sp_mid' is defined as part of spectrum.h. The only file which uses it is spectrum_switchdev.c, so there is no reason to expose it to other files. Move the definition to spectrum_switchdev.c. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29mlxsw: Align PGT index to legacy bridge modelAmit Cohen1-0/+1
FID code reserves about 15K entries in PGT table for flooding. These entries are just allocated and are not used yet because the code that uses them is skipped now. The next patches will convert MDB code to use PGT APIs. The allocation of indexes for multicast is done after FID code reserves 15K entries. Currently, legacy bridge model is used and firmware manages PGT table. That means that the indexes which are allocated using PGT API are too high when legacy bridge model is used. To not exceed firmware limitation for MDB entries, add an API that returns the correct 'mid_index', based on bridge model. For legacy model, subtract the number of flood entries from PGT index. Use it to write the correct MID to SMID register. This API will be used also from MDB code in the next patches. PGT should not be aware of MDB and FID different usage, this API is temporary and will be removed once unified bridge model will be used. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-28mlxsw: Extend PGT APIs to support maintaining list of ports per entryAmit Cohen1-0/+2
Add an API to associate a PGT entry with SMPE index and add or remove a port. This API will be used by FID code and MDB code, to add/remove port from specific PGT entry. When the first port is added to PGT entry, allocate the entry in the given MID index, when the last port is removed from PGT entry, free it. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-28mlxsw: Add a dedicated structure for bitmap of portsAmit Cohen1-0/+25
Currently when bitmap of ports is needed, 'unsigned long *' type is used. The functions which use the bitmap assume its length according to its name, i.e., each function which gets a bitmap of ports queries the maximum number of ports and uses it as the size. As preparation for the next patch which will use bitmap of ports, add a dedicated structure for it. Refactor the existing code to use the new structure. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-28mlxsw: Add an indication of SMPE index validity for PGT tableAmit Cohen1-0/+1
In Spectrum-1, the index into the MPE table - called switch multicast to port egress VID (SMPE) - is derived from the PGT entry, whereas in Spectrum-2 and later ASICs it is derived from the FID. Therefore, in Spectrum-1, the SMPE index needs to be programmed as part of the PGT entry via SMID register, while it is reserved for Spectrum-2 and later ASICs. Add 'pgt_smpe_index_valid' boolean as part of 'struct mlxsw_sp' and set it to true for Spectrum-1 and to false for the later ASICs. Add 'smpe_index_valid' as part of 'struct mlxsw_sp_pgt' and set it according to the value in 'struct mlxsw_sp' as part of PGT initialization. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-28mlxsw: Add an initial PGT table supportAmit Cohen1-0/+12
The PGT (Port Group Table) table maps an index to a bitmap of local ports to which a packet needs to be replicated. This table is used for layer 2 multicast and flooding. In the legacy model, software did not interact with this table directly. Instead, it was accessed by firmware in response to registers such as SFTR and SMID. In the new model, the SFTR register is deprecated and software has full control over the PGT table using the SMID register. The entire state of the PGT table needs to be maintained in software because member ports in a PGT entry needs to be reference counted to avoid releasing entries which are still in use. Add the following APIs: 1. mlxsw_sp_pgt_{init, fini}() - allocate/free the PGT table. 2. mlxsw_sp_pgt_mid_alloc_range() - allocate a range of MID indexes in PGT. To be used by FID code during initialization to reserve specific PGT indexes for flooding entries. 3. mlxsw_sp_pgt_mid_free_range() - free indexes in a given range. 4. mlxsw_sp_pgt_mid_alloc() - allocate one MID index in the PGT at a non-specific range, just search for free index. To be used by MDB code. 5. mlxsw_sp_pgt_mid_free() - free the given index. Note that alloc() functions do not allocate the entries in software, just allocate IDs using 'idr'. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-28mlxsw: spectrum: Add a temporary variable to indicate bridge modelAmit Cohen1-0/+1
As part of transition to unified bridge model, many different firmware configurations are done. Some of the configuration that needs to be done for the unified bridge model is not valid under the legacy model, and would be rejected by the firmware. At the same time, the driver cannot switch to the unified bridge model until all of the code has been converted. To allow breaking the change into patches, and to not break driver behavior during the transition, add a boolean variable to indicate bridge model. Then, forbidden configurations will be skipped using the check - "if (!mlxsw_sp->ubridge)". The new variable is temporary for several sets, it will be removed when firmware will be configured to work with unified bridge model. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-24mlxsw: spectrum: Rename MLXSW_SP_RIF_TYPE_VLANAmit Cohen1-1/+1
Currently, the driver emulates 802.1Q FIDs using 802.1D FIDs. As such, the RIFs configured on top of these FIDs are FID RIFs and not VLAN RIFs. As part of converting the driver to the unified bridge model, 802.1Q FIDs and VLAN RIFs will be used. As a preparation for this change, rename the emulated VLAN RIFs from 'MLXSW_SP_RIF_TYPE_VLAN' to 'MLXSW_SP_RIF_TYPE_VLAN_EMU'. After the conversion the emulated VLAN RIFs will be removed. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-24mlxsw: spectrum: Use different arrays of FID families per-ASIC typeAmit Cohen1-0/+4
Egress VID for layer 2 multicast is determined from two tables, the MPE and PGT tables. The MPE table is a two dimensional table indexed by local port and SMPE index, which should be thought of as a FID index. In Spectrum-1 the SMPE index is derived from the PGT entry, whereas in Spectrum-2 and newer ASICs the SMPE index is a FID attribute configured via the SFMR register. The validity of the SMPE index in SFMR is influenced from two factors: 1. FID family. SMPE index is reserved for rFIDs, as their flooding is handled by firmware. 2. ASIC generation. SMPE index is always reserved for Spectrum-1. As such, the validity of the SMPE index should be an attribute of the FID family and have different arrays of FID families per-ASIC type. As a preparation for SMPE index configuration, create separate arrays of FID families for different ASICs. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: Remove lag_vid_valid indicationAmit Cohen1-1/+0
Currently 'struct mlxsw_sp_fid_family' has a field which indicates if 'lag_vid' is valid for use in SFD register. This is a leftover from using .1Q FIDs instead of emulating them using .1D FIDs. Currently when .1Q FIDs are emulated using .1D FIDs, this field is true for both families, so there is no reason to maintain it. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17mlxsw: Add a resource describing number of RIFsPetr Machata1-0/+1
The Spectrum ASIC has a limit on how many L3 devices (called RIFs) can be created. The limit depends on the ASIC and FW revision, and mlxsw reads it from the FW. In order to communicate both the number of RIFs that there can be, and how many are taken now (i.e. occupancy), introduce a corresponding devlink resource. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Move handling of tunnel events to router codePetr Machata1-13/+0
The events related to IPIP tunnels are handled by the router code. Move the handling from the central dispatcher in spectrum.c to the new notifier handler in the router module. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Move handling of router events to router codePetr Machata1-2/+0
The events NETDEV_PRE_CHANGEADDR, NETDEV_CHANGEADDR and NETDEV_CHANGEMTU have implications for in-ASIC router interface objects, and as such are handled in the router module. Move the handling from the central dispatcher in spectrum.c to the new notifier handler in the router module. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Move handling of VRF events to router codePetr Machata1-2/+0
Events involving VRF, as L3 concern, are handled in the router code, by the helper mlxsw_sp_netdevice_vrf_event(). The handler is currently invoked from the centralized dispatcher in spectrum.c. Instead, move the call to the newly-introduced router-specific notifier handler. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18mlxsw: spectrum: Add port to linecard mappingJiri Pirko1-0/+1
For each port get slot_index using PMLP register. For ports residing on a linecard, identify it with the linecard by setting mapping using devlink_port_linecard_set() helper. Use linecard slot index for PMTDB register queries. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18mlxsw: spectrum: Introduce port mapping change event processingJiri Pirko1-0/+7
Register PMLPE trap and process the port mapping changes delivered by it by creating related ports. Note that this happens after provisioning. The INI of the linecard is processed and merged by FW. PMLPE is generated for each port. Process this mapping change. Layout of PMLPE is the same as layout of PMLP. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18mlxsw: spectrum: Allocate port mapping array of structs instead of pointersJiri Pirko1-1/+1
Instead of array of pointers to port mapping structures, allocate the array of structures directly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-07mlxsw: Support FLOW_ACTION_MANGLE for SIP and DIP IPv6 addressesDanielle Ratson1-1/+24
Spectrum-2 supports an ACL action SIP_DIP, which allows IPv4 and IPv6 source and destination addresses change. Offload suitable mangles to the IPv6 address change action. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-28mlxsw: spectrum: Guard against invalid local portsAmit Cohen1-0/+7
When processing events generated by the device's firmware, the driver protects itself from events reported for non-existent local ports, but not for the CPU port (local port 0), which exists, but does not have all the fields as any local port. This can result in a NULL pointer dereference when trying access 'struct mlxsw_sp_port' fields which are not initialized for CPU port. Commit 63b08b1f6834 ("mlxsw: spectrum: Protect driver from buggy firmware") already handled such issue by bailing early when processing a PUDE event reported for the CPU port. Generalize the approach by moving the check to a common function and making use of it in all relevant places. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07mlxsw: spectrum_acl_bloom_filter: Add support for Spectrum-4 calculationAmit Cohen1-0/+1
Spectrum-4 will calculate hash function for bloom filter differently from the existing ASICs. First, two hash functions will be used to calculate 16 bits result. The final result will be combination of the two results - 6 bits which are result of CRC-6 will be used as MSB and 10 bits which are result of CRC-10 will be used as LSB. Second, while in Spectrum{2,3}, there is a padding in each chunk, so the chunks use a sequence of whole bytes, in Spectrum-4 there is no padding, so each chunk use 20 bytes minus 2 bits, so it is necessary to align the chunks to be without holes. Add dedicated 'mlxsw_sp_acl_bf_ops' for Spectrum-4 and add the required tables for CRC calculations. All the details are documented as part of the code for future use. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07mlxsw: Add operations structure for bloom filter calculationAmit Cohen1-0/+4
Spectrum-4 will calculate hash function for bloom filter differently from the existing ASICs. There are two changes: 1. Instead of using one hash function to calculate 16 bits output (CRC-16), two functions will be used. 2. The chunks will be built differently, without padding. As preparation for support of Spectrum-4 bloom filter, add 'ops' structure to allow handling different calculation for different ASICs. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07mlxsw: Introduce flex key elements for Spectrum-4Amit Cohen1-0/+1
Spectrum-4 ASIC will support more virtual routers and local ports compared to the existing ASICs. Therefore, the virtual router and local port ACL key elements need to be increased. Introduce new key elements for Spectrum-4 to be aligned with the elements used already for other Spectrum ASICs. The key blocks layout is the same for Spectrum-4, so use the existing code for encode_block() and clear_block(), just create separate blocks. Note that size of `VIRT_ROUTER_MSB` is 4 bits in Spectrum-4, therefore declare it using `MLXSW_AFK_ELEMENT_INST_U32()`, in order to be able to set `.avoid_size_check` to true. Otherwise, `mlxsw_afk_blocks_check()` will fail and warn. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>