summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2020-05-11sfc: move vport_id to struct efx_nicEdward Cree7-38/+29
Remove some usage of ef10-specific nic_data structs from common MCDI functions, in preparation for using them from a non-EF10 driver. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11net: qed: Disable SRIOV functionality inside kdump kernelBhupesh Sharma2-4/+8
Since we have kdump kernel(s) running under severe memory constraint it makes sense to disable the qed SRIOV functionality when running the kdump kernel as kdump configurations on several distributions don't support SRIOV targets for saving the vmcore (see [1] for example). Currently the qed SRIOV functionality ends up consuming memory in the kdump kernel, when we don't really use the same. An example log seen in the kdump kernel with the SRIOV functionality enabled can be seen below (obtained via memstrack tool, see [2]): dracut-pre-pivot[676]: ======== Report format module_summary: ======== dracut-pre-pivot[676]: Module qed using 149.6MB (2394 pages), peak allocation 149.6MB (2394 pages) This patch disables the SRIOV functionality inside kdump kernel and with the same applied the memory consumption goes down: dracut-pre-pivot[671]: ======== Report format module_summary: ======== dracut-pre-pivot[671]: Module qed using 124.6MB (1993 pages), peak allocation 124.7MB (1995 pages) [1]. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/installing-and-configuring-kdump_managing-monitoring-and-updating-the-kernel#supported-kdump-targets_supported-kdump-configurations-and-targets [2]. Memstrack tool: https://github.com/ryncsn/memstrack Cc: kexec@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: Ariel Elior <aelior@marvell.com> Cc: GR-everest-linux-l2@marvell.com Cc: Manish Chopra <manishc@marvell.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11net: qed*: Reduce RX and TX default ring count when running inside kdump kernelBhupesh Sharma2-2/+11
Normally kdump kernel(s) run under severe memory constraint with the basic idea being to save the crashdump vmcore reliably when the primary kernel panics/hangs. Currently the qed* ethernet driver ends up consuming a lot of memory in the kdump kernel, leading to kdump kernel panic when one tries to save the vmcore via ssh/nfs (thus utilizing the services of the underlying qed* network interfaces). An example OOM message log seen in the kdump kernel can be seen here [1], with crashkernel size reservation of 512M. Using tools like memstrack (see [2]), we can track the modules taking up the bulk of memory in the kdump kernel and organize the memory usage output as per 'highest allocator first'. An example log for the OOM case indicates that the qed* modules end up allocating approximately 216M memory, which is a large part of the total crashkernel size: dracut-pre-pivot[676]: ======== Report format module_summary: ======== dracut-pre-pivot[676]: Module qed using 149.6MB (2394 pages), peak allocation 149.6MB (2394 pages) dracut-pre-pivot[676]: Module qede using 65.3MB (1045 pages), peak allocation 65.3MB (1045 pages) This patch reduces the default RX and TX ring count from 1024 to 64 when running inside kdump kernel, which leads to a significant memory saving. An example log with the patch applied shows the reduced memory allocation in the kdump kernel: dracut-pre-pivot[674]: ======== Report format module_summary: ======== dracut-pre-pivot[674]: Module qed using 141.8MB (2268 pages), peak allocation 141.8MB (2268 pages) <..snip..> [dracut-pre-pivot[674]: Module qede using 4.8MB (76 pages), peak allocation 4.9MB (78 pages) Tested crashdump vmcore save via ssh/nfs protocol using underlying qed* network interface after applying this patch. [1] OOM log: ------------ kworker/0:6: page allocation failure: order:6, mode:0x60c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null) kworker/0:6 cpuset=/ mems_allowed=0 CPU: 0 PID: 145 Comm: kworker/0:6 Not tainted 4.18.0-109.el8.aarch64 #1 Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL025 01/18/2019 Workqueue: events work_for_cpu_fn Call trace: dump_backtrace+0x0/0x188 show_stack+0x24/0x30 dump_stack+0x90/0xb4 warn_alloc+0xf4/0x178 __alloc_pages_nodemask+0xcac/0xd58 alloc_pages_current+0x8c/0xf8 kmalloc_order_trace+0x38/0x108 qed_iov_alloc+0x40/0x248 [qed] qed_resc_alloc+0x224/0x518 [qed] qed_slowpath_start+0x254/0x928 [qed] __qede_probe+0xf8/0x5e0 [qede] qede_probe+0x68/0xd8 [qede] local_pci_probe+0x44/0xa8 work_for_cpu_fn+0x20/0x30 process_one_work+0x1ac/0x3e8 worker_thread+0x44/0x448 kthread+0x130/0x138 ret_from_fork+0x10/0x18 Cannot start slowpath qede: probe of 0000:05:00.1 failed with error -12 [2]. Memstrack tool: https://github.com/ryncsn/memstrack Cc: kexec@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: Ariel Elior <aelior@marvell.com> Cc: GR-everest-linux-l2@marvell.com Cc: Manish Chopra <manishc@marvell.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11hinic: add link_ksettings ethtool_ops supportLuo bin6-9/+682
add set_link_ksettings implementation and improve the implementation of get_link_ksettings Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11net: atarilance: Replace zero-length array with flexible-arrayGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11net: dsa: sja1105: implement cross-chip bridging operationsVladimir Oltean2-0/+92
sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-11net: dsa: permit cross-chip bridging between all trees in the systemVladimir Oltean1-4/+12
One way of utilizing DSA is by cascading switches which do not all have compatible taggers. Consider the following real-life topology: +---------------------------------------------------------------+ | LS1028A | | +------------------------------+ | | | DSA master for Felix | | | |(internal ENETC port 2: eno2))| | | +------------+------------------------------+-------------+ | | | Felix embedded L2 switch | | | | | | | | +--------------+ +--------------+ +--------------+ | | | | |DSA master for| |DSA master for| |DSA master for| | | | | | SJA1105 1 | | SJA1105 2 | | SJA1105 3 | | | | | |(Felix port 1)| |(Felix port 2)| |(Felix port 3)| | | +--+-+--------------+---+--------------+---+--------------+--+--+ +-----------------------+ +-----------------------+ +-----------------------+ | SJA1105 switch 1 | | SJA1105 switch 2 | | SJA1105 switch 3 | +-----+-----+-----+-----+ +-----+-----+-----+-----+ +-----+-----+-----+-----+ |sw1p0|sw1p1|sw1p2|sw1p3| |sw2p0|sw2p1|sw2p2|sw2p3| |sw3p0|sw3p1|sw3p2|sw3p3| +-----+-----+-----+-----+ +-----+-----+-----+-----+ +-----+-----+-----+-----+ The above can be described in the device tree as follows (obviously not complete): mscc_felix { dsa,member = <0 0>; ports { port@4 { ethernet = <&enetc_port2>; }; }; }; sja1105_switch1 { dsa,member = <1 1>; ports { port@4 { ethernet = <&mscc_felix_port1>; }; }; }; sja1105_switch2 { dsa,member = <2 2>; ports { port@4 { ethernet = <&mscc_felix_port2>; }; }; }; sja1105_switch3 { dsa,member = <3 3>; ports { port@4 { ethernet = <&mscc_felix_port3>; }; }; }; Basically we instantiate one DSA switch tree for every hardware switch in the system, but we still give them globally unique switch IDs (will come back to that later). Having 3 disjoint switch trees makes the tagger drivers "just work", because net devices are registered for the 3 Felix DSA master ports, and they are also DSA slave ports to the ENETC port. So packets received on the ENETC port are stripped of their stacked DSA tags one by one. Currently, hardware bridging between ports on the same sja1105 chip is possible, but switching between sja1105 ports on different chips is handled by the software bridge. This is fine, but we can do better. In fact, the dsa_8021q tag used by sja1105 is compatible with cascading. In other words, a sja1105 switch can correctly parse and route a packet containing a dsa_8021q tag. So if we could enable hardware bridging on the Felix DSA master ports, cross-chip bridging could be completely offloaded. Such as system would be used as follows: ip link add dev br0 type bridge && ip link set dev br0 up for port in sw0p0 sw0p1 sw0p2 sw0p3 \ sw1p0 sw1p1 sw1p2 sw1p3 \ sw2p0 sw2p1 sw2p2 sw2p3; do ip link set dev $port master br0 done The above makes switching between ports on the same row be performed in hardware, and between ports on different rows in software. Now assume the Felix switch ports are called swp0, swp1, swp2. By running the following extra commands: ip link add dev br1 type bridge && ip link set dev br1 up for port in swp0 swp1 swp2; do ip link set dev $port master br1 done the CPU no longer sees packets which traverse sja1105 switch boundaries and can be forwarded directly by Felix. The br1 bridge would not be used for any sort of traffic termination. For this to work, we need to give drivers an opportunity to listen for bridging events on DSA trees other than their own, and pass that other tree index as argument. I have made the assumption, for the moment, that the other existing DSA notifiers don't need to be broadcast to other trees. That assumption might turn out to be incorrect. But in the meantime, introduce a dsa_broadcast function, similar in purpose to dsa_port_notify, which is used only by the bridging notifiers. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-11net: hns3: disable auto-negotiation off with 1000M setting in ethtoolYufeng Mo1-1/+6
The 802.3 specification does not specify the behavior of auto-negotiation off with 1000M in PHY. Therefore, some PHY compatibility issues occur. This patch forbids the setting of this unreasonable mode by ethtool in driver. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-11net: hns3: optimized the judgment of the input parameters of dump ncl configYufeng Mo1-5/+10
This patch optimizes the judgment of the input parameters of dump ncl config by checking the number and value of the input parameters apart. It's clearer and more reasonable. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-11net: hns3: provide .get_cmdq_stat interface for the clientHuazhong Tan2-0/+10
This patch provides a new interface for the client to query whether CMDQ is ready to work. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-11net: hns3: modify two uncorrect macro namesHuazhong Tan2-4/+4
According to the UM, command 0x0B03 and 0x0B13 are used to query the statistics about TX and RX, not the status, so modifies the unsuitable macro name of these two command. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-11net: hns3: remove a redundant register macro definitionHuazhong Tan2-8/+5
HCLGE_MISC_VECTOR_INT_STS and HCLGE_VECTOR_PF_OTHER_INT_STS_REG both represent the misc interrupt status register(0x20800), so removes HCLGE_VECTOR_PF_OTHER_INT_STS_REG and replaces it with HCLGE_MISC_VECTOR_INT_STS. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: phy: Put interface into oper testing during cable testAndrew Lunn1-1/+11
Since running a cable test is disruptive, put the interface into operative state testing while the test is running. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: phy: marvell: Add cable test supportAndrew Lunn1-0/+201
The Marvell PHYs have a couple of different register sets for performing cable tests. Page 7 provides the simplest to use. v3: s/mavell/marvell/g Remove include of <uapi/linux/ethtool_netlink.h> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: ethtool: Add infrastructure for reporting cable test resultsAndrew Lunn1-2/+20
Provide infrastructure for PHY drivers to report the cable test results. A netlink skb is associated to the phydev. Helpers will be added which can add results to this skb. Once the test has finished the results are sent to user space. When netlink ethtool is not part of the kernel configuration stubs are provided. It is also impossible to trigger a cable test, so the error code returned by the alloc function is of no consequence. v2: Include the status complete in the netlink notification message v4: Replace -EINVAL with -EMSGSIZE Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: phy: Add support for polling cable testAndrew Lunn1-0/+2
Some PHYs are not capable of generating interrupts when a cable test finished. They do however support interrupts for normal operations, like link up/down. As such, the PHY state machine would normally not poll the PHY. Add support for indicating the PHY state machine must poll the PHY when performing a cable test. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: phy: Add cable test support to state machineAndrew Lunn1-0/+76
Running a cable test is desruptive to normal operation of the PHY and can take a 5 to 10 seconds to complete. The RTNL lock cannot be held for this amount of time, and add a new state to the state machine for running a cable test. The driver is expected to implement two functions. The first is used to start a cable test. Once the test has started, it should return. The second function is called once per second, or on interrupt to check if the cable test is complete, and to allow the PHY to report the status. v2: Rename phy_cable_test_abort to phy_abort_cable_test Return different extack when already running test Use phy_init_hw() to reset the PHY Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: usb: qmi_wwan: remove redundant assignment to variable statusColin Ian King1-1/+1
The variable status is being initializeed with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: huawei_cdc_ncm: remove redundant assignment to variable retColin Ian King1-1/+1
The variable ret is being initializeed with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10net: usb: ax88179_178a: remove redundant assignment to variable retColin Ian King1-1/+1
The variable ret is being initializeed with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10dsa: sja1105: fix semicolon.cocci warningskbuild test robot1-1/+1
drivers/net/dsa/sja1105/sja1105_ethtool.c:481:11-12: Unneeded semicolon Remove unneeded semicolon. Generated by: scripts/coccinelle/misc/semicolon.cocci Fixes: ae1804de93f6 ("dsa: sja1105: dynamically allocate stats structure") CC: Arnd Bergmann <arnd@arndb.de> Signed-off-by: kbuild test robot <lkp@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10octeontx2-pf: Use the napi_alloc_frag() to alloc the pool buffersKevin Hao4-50/+24
In the current codes, the octeontx2 uses its own method to allocate the pool buffers, but there are some issues in this implementation. 1. We have to run the otx2_get_page() for each allocation cycle and this is pretty error prone. As I can see there is no invocation of the otx2_get_page() in otx2_pool_refill_task(), this will leave the allocated pages have the wrong refcount and may be freed wrongly. 2. It wastes memory. For example, if we only receive one packet in a NAPI RX cycle, and then allocate a 2K buffer with otx2_alloc_rbuf() to refill the pool buffers and leave the remain area of the allocated page wasted. On a kernel with 64K page, 62K area is wasted. IMHO it is really unnecessary to implement our own method for the buffers allocate, we can reuse the napi_alloc_frag() to simplify our code. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10mlxsw: spectrum_flower: Forbid to insert flower rules in collision with ↵Jiri Pirko1-0/+32
matchall rules On ingress, the matchall rules doing mirroring and sampling are offloaded into hardware blocks that are processed before any flower rules. On egress, the matchall mirroring rules are offloaded into hardware block that is processed after all flower rules. Therefore check the priorities of inserted flower rules against existing matchall rules and ensure the correct ordering. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10mlxsw: spectrum_matchall: Forbid to insert matchall rules in collision with ↵Jiri Pirko3-3/+41
flower rules On ingress, the matchall rules doing mirroring and sampling are offloaded into hardware blocks that are processed before any flower rules. On egress, the matchall mirroring rules are offloaded into hardware block that is processed after all flower rules. Therefore check the priorities of inserted matchall rules against existing flower rules and ensure the correct ordering. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10mlxsw: spectrum_matchall: Expose a function to get min and max rule priorityJiri Pirko2-0/+38
Introduce an infrastructure that allows to get minimum and maximum rule priority for specified chain. This is going to be used by a subsequent patch to enforce ordering between flower and matchall filters. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10mlxsw: spectrum_matchall: Put matchall list into substruct of flow structJiri Pirko3-7/+9
As there are going to be other matchall specific fields in flow structure, put the existing list field into matchall substruct. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10mlxsw: spectrum_flower: Expose a function to get min and max rule priorityJiri Pirko5-7/+75
Introduce an infrastructure that allows to get minimum and maximum rule priority for specified chain. This is going to be used by a subsequent patch to enforce ordering between flower and matchall filters. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10mlxsw: spectrum_matchall: Restrict sample action to be allowed only on ingressJiri Pirko1-0/+5
HW supports packet sampling on ingress only. Check and fail if user is adding sample on egress. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10hinic: add three net_device_ops of vfLuo bin11-24/+453
adds ndo_set_vf_rate/ndo_set_vf_spoofchk/ndo_set_vf_link_state to configure netdev of virtual function Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09Merge tag 'mlx5-updates-2020-05-09' of ↵Jakub Kicinski20-283/+456
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-05-09 This series includes updates to mlx5 netdev driver and bonding updates to support getting the next active tx slave. 1) merge commit with mlx5-next that includes bonding updates from Maor Bonding: Add support to get xmit slave 2) Maxim makes some general code improvements to TX data path 3) Tariq makes some general code improvements to kTLS and mlx5 accel layer in preparation for mlx5 TLS RX. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net: atlantic: unify MAC generationMark Starovoytov3-44/+15
This patch unifies invalid MAC address handling with other drivers. Basically we've switched to using standard APIs (is_valid_ether_addr / eth_hw_addr_random) where possible. It's worth noting that some of engineering Aquantia NICs might be provisioned with a partially zeroed out MAC, which is still invalid, but not caught by is_valid_ether_addr(), so we've added a special handling for this case. Also adding a warning in case of fallback to random MAC, because this shouldn't be needed on production NICs, they should all be provisioned with unique MAC. NB! Default systemd/udevd configuration is 'MACAddressPolicy=persistent'. This causes MAC address to be persisted across driver reloads and reboots. We had to change it to 'none' for verification purposes. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net: atlantic: remove check for boot code survivability before reset requestMark Starovoytov1-8/+0
This patch removes unnecessary check for boot code survivability before reset request. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net: atlantic: remove hw_atl_b0_hw_rss_set call from A2 codeMark Starovoytov3-8/+7
No need to call hw_atl_b0_hw_rss_set from hw_atl2_hw_rss_set Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net: atlantic: remove TPO2 check from A0 codeMark Starovoytov1-2/+1
TPO2 was introduced in B0 only, no reason to check for it in A0 code. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net: atlantic: rename AQ_NIC_RATE_2GS to AQ_NIC_RATE_2G5Mark Starovoytov10-42/+49
This patch changes the constant name to a more logical "2G5" (for 2.5G speeds). Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net: atlantic: minor MACSec code cleanupMark Starovoytov1-3/+3
This patch fixes a couple of minor merge issues found in macsec_api.c after corresponding patch series has been applied. These are not real bugs, so pushing to net-next. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net: atlantic: use __packed instead of the full expansion.Mark Starovoytov1-2/+2
This patches fixes the review comment made by Jakub Kicinski in the "net: atlantic: A2 support" patch series. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09net/mlx5e: Enhance ICOSQ WQE info fieldsTariq Toukan3-11/+19
The same WQE opcode might be used in different ICOSQ flows and WQE types. To have a better distinguishability, replace it with an enum that better indicates the WQE type and flow it is used for. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5: Accel, Remove unnecessary header includeTariq Toukan1-1/+0
The include of Ethernet driver header in core is not needed and actually wrong. Remove it. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Use struct assignment for WQE info updatesTariq Toukan4-17/+24
Struct assignment looks more clean, and implies resetting the not assigned fields to zero, instead of holding values from older ring cycles. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Take TX WQE info structures out of general EN headerTariq Toukan3-27/+27
Into the txrx header file. The mlx5e_sq_wqe_info structure describes WQE info for the ICOSQ, rename it to better reflect this. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: kTLS, Do not fill edge for the DUMP WQEs in TX flowTariq Toukan1-4/+1
Every single DUMP WQE resides in a single WQEBB. As the pi is calculated per each one separately, there is no real need for a contiguous room for them, allow them to populate different WQ fragments. This reduces WQ waste and improves its utilization. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: kTLS, Fill work queue edge separately in TX flowTariq Toukan1-10/+8
For the static and progress context params WQEs, do the edge filling separately. This improves the WQ utilization, code readability, and reduces the chance of future bugs. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Split TX acceleration offloads into two phasesMaxim Mikityanskiy8-23/+52
After previous modifications, the offloads are no longer called one by one, the pi is calculated and the wqe is cleared on between of TLS and IPSEC offloads, which doesn't quite fit mlx5e_accel_handle_tx's purpose. This patch splits mlx5e_accel_handle_tx into two functions that correspond to two logical phases of running offloads: 1. Before fetching a WQE. Here runs the code that can post WQEs on its own, before the main WQE is fetched. It's the main part of TLS offload. 2. After fetching a WQE. Here runs the code that updates the WQE's fields, but can't post other WQEs any more. It's a minor part of TLS offload that sets the tisn field in the cseg, and eseg-based offloads (currently IPSEC, and later patches will move GENEVE and checksum offloads there, too). It allows to make mlx5e_xmit take care of all actions needed to transmit a packet in the right order, improve the structure of the code and reduce unnecessary operations. The structure will be further improved in the following patches (all eseg-based offloads will be moved to a single place, and reserving space for the main WQE will happen between phase 1 and phase 2 of offloads to eliminate unneeded data movements). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Update UDP fields of the SKB for GSO firstMaxim Mikityanskiy1-3/+5
mlx5e_udp_gso_handle_tx_skb updates the length field in the UDP header in case of GSO. It doesn't interfere with other offloads, so do it first to simplify further restructuring of the code. This way we'll make all independent modifications to the SKB before starting to work with WQEs. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Make TLS offload independent of wqe and piMaxim Mikityanskiy5-23/+23
TLS offload may write a 32-bit field (tisn) to the cseg of the WQE. To do that, it receives pi and wqe pointers. As TLS offload may also send additional WQEs, it has to update pi and wqe, and in many cases it even doesn't use pi calculated before and wqe zeroed before and does it itself. Also, mlx5e_sq_xmit has to copy the whole cseg if it goes to the mlx5e_fill_sq_frag_edge flow. This all is not efficient. It's more efficient to do the following: 1. Just return tisn from TLS offload and make the caller fill it in a more appropriate place. 2. Calculate pi and clear wqe after calling TLS offload. 3. If TLS offload has to send WQEs, calculate pi and clear wqe just before that. It's already done in all places anyway, so this commit allows to remove some redundant memsets and calls. Copying of cseg will be eliminated in one of the following commits, and all other stuff is done here. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Pass only eseg to IPSEC offloadMaxim Mikityanskiy3-4/+4
IPSEC offload needs to modify the eseg of the WQE that is being filled, but it receives a pointer to the whole WQE. To make the contract stricter, pass only the pointer to the eseg of that WQE. This commit is preparation for the following refactoring of offloads in the TX path and for the MPWQE support. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Return void from mlx5e_sq_xmit and mlx5i_sq_xmitMaxim Mikityanskiy4-19/+18
mlx5e_sq_xmit and mlx5i_sq_xmit always return NETDEV_TX_OK. Drop the return value to simplify the code. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Unify checks of TLS offloadsMaxim Mikityanskiy3-22/+13
Both INNOVA and ConnectX TLS offloads perform the same checks in the beginning. Unify them to reduce repeating code. Do WARN_ON_ONCE on netdev mismatch and finish with an error in both offloads, not only in the ConnectX one. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09net/mlx5e: Return bool from TLS and IPSEC offloadsMaxim Mikityanskiy8-68/+50
TLS and IPSEC offloads currently return struct sk_buff *, but the value is either NULL or the same skb that was passed as a parameter. Return bool instead to provide stronger guarantees to the calling code (it won't need to support handling a different SKB that could be potentially returned before this change) and to simplify restructuring this code in the following commits. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>