Age | Commit message | Author | Files | Lines
2020-07-23 | lan743x: remove redundant initialization of variable current_head_index | Colin Ian King | 1 | -2/+1
The variable current_head_index is being initialized with a value that is never read and it is being updated later with a new value. Replace the initialization of -1 with the latter assignment. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 | enetc: Remove the imdio bus on PF probe bailout | Claudiu Manoil | 1 | -1/+9
enetc_imdio_remove() is missing from the enetc_pf_probe() bailout path. Not surprisingly because enetc_setup_serdes() is registering the imdio bus for internal purposes, and it's not obvious that enetc_imdio_remove() currently performs the teardown of enetc_setup_serdes(). To fix this, define enetc_teardown_serdes() to wrap enetc_imdio_remove() (improve code maintenance) and call it on bailout and remove paths. Fixes: 975d183ef0ca ("net: enetc: Initialize SerDes for SGMII and USXGMII protocols") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 | net: qed: Remove unneeded cast from memory allocation | Wang Hai | 1 | -2/+1
Remove the cast from the value returned by the memory allocation function. Coccinelle emits: WARNING: casting value returned by memory allocation function to (struct roce_destroy_qp_req_output_params *) is useless. This issue was detected by using the Coccinelle software. Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 | net: phy: fix check in get_phy_c45_ids | Vladimir Oltean | 1 | -2/+2
After the patch below, the iteration through the available MMDs is completely short-circuited, and devs_in_pkg remains set to the initial value of zero.

Due to devs_in_pkg being zero, the rest of get_phy_c45_ids() is short-circuited too: the following loop never reaches below this point either (it executes "continue" for every device in package, failing to retrieve PHY ID for any of them):

    /* Now probe Device Identifiers for each device present. */
    for (i = 1; i < num_ids; i++) {
        if (!(devs_in_pkg & (1 << i)))
            continue;

So c45_ids->device_ids remains populated with zeroes. This causes an Aquantia AQR412 PHY (same as any C45 PHY would, in fact) to be probed by the Generic PHY driver.

The issue seems to be a case of submitting partially committed work (and therefore testing something other than what was submitted). The intention of the patch was to delay exiting the loop until one more condition is reached (the devs_in_pkg read from hardware is either 0, OR mostly f's). So fix the patch to reflect that.

Tested with traffic on a LS1028A-QDS, the PHY is now probed correctly using the Aquantia driver. The devs_in_pkg bit field is set to 0xe000009a, and the MMDs that are present have the following IDs:

    [ 5.600772] libphy: get_phy_c45_ids: device_ids[1]=0x3a1b662
    [ 5.618781] libphy: get_phy_c45_ids: device_ids[3]=0x3a1b662
    [ 5.630797] libphy: get_phy_c45_ids: device_ids[4]=0x3a1b662
    [ 5.654535] libphy: get_phy_c45_ids: device_ids[7]=0x3a1b662
    [ 5.791723] libphy: get_phy_c45_ids: device_ids[29]=0x3a1b662
    [ 5.804050] libphy: get_phy_c45_ids: device_ids[30]=0x3a1b662
    [ 5.816375] libphy: get_phy_c45_ids: device_ids[31]=0x0
    [ 7.690237] mscc_felix 0000:00:00.5: PHY [0.5:00] driver [Aquantia AQR412] (irq=POLL)
    [ 7.704739] mscc_felix 0000:00:00.5: PHY [0.5:01] driver [Aquantia AQR412] (irq=POLL)
    [ 7.718918] mscc_felix 0000:00:00.5: PHY [0.5:02] driver [Aquantia AQR412] (irq=POLL)
    [ 7.733044] mscc_felix 0000:00:00.5: PHY [0.5:03] driver [Aquantia AQR412] (irq=POLL)

Fixes: bba238ed037c ("net: phy: continue searching for C45 MMDs even if first returned ffff:ffff")
Reported-by: Colin King <colin.king@canonical.com>
Reported-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
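The relationship between the devs_in_pkg value and the device_ids[] indices in the log can be checked with a small user-space sketch. It is not the kernel code: the function names are hypothetical, and the upper bound of 32 indices is an assumption based on devs_in_pkg being a 32-bit field. The loop body follows the snippet quoted in the commit message.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Collect the MMD indices whose bit is set in devs_in_pkg.
 * As in the quoted loop, index 0 is skipped and every index whose
 * bit is clear is passed over with "continue".
 * Returns the number of indices written to out[] (at most 31). */
int present_mmds(uint32_t devs_in_pkg, int out[31])
{
    int n = 0;

    assert(out != NULL);
    for (int i = 1; i < 32; i++) {
        if (!(devs_in_pkg & (UINT32_C(1) << i)))
            continue;
        out[n++] = i;
    }
    return n;
}

/* Print the present indices on one line for comparison with a boot log. */
void print_mmds(uint32_t devs_in_pkg)
{
    int idx[31];
    int n = present_mmds(devs_in_pkg, idx);

    printf("devs_in_pkg=0x%08" PRIx32 ":", devs_in_pkg);
    for (int k = 0; k < n; k++)
        printf(" %d", idx[k]);
    printf("\n");
}
```

With devs_in_pkg equal to zero the loop visits nothing, which is the failure mode the commit describes; with 0xe000009a it yields the seven indices that appear in the log.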
2020-07-23 | net: dccp: Add SIOCOUTQ IOCTL support (send buffer fill) | Richard Sailer | 2 | -0/+12
This adds support for the SIOCOUTQ IOCTL to get the send buffer fill of a DCCP socket, like UDP and TCP sockets already have. Regarding the used data field: DCCP uses per packet sequence numbers, not per byte, so sequence numbers can't be used like in TCP. sk_wmem_queued is not used by DCCP and is always 0, even in tests on highly congested paths. Therefore this uses sk_wmem_alloc like in UDP. Signed-off-by: Richard Sailer <richard_siegfried@systemli.org> Signed-off-by: David S. Miller <davem@davemloft.net>
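From user space, the send buffer fill is read with a plain ioctl() call. The following is a minimal sketch for Linux; the helper name send_queue_bytes() is hypothetical. It works on any socket type that implements SIOCOUTQ, so it can be tried on a UDP socket today and on a DCCP socket only with a kernel that carries this change.

```c
#include <assert.h>
#include <linux/sockios.h>  /* SIOCOUTQ */
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Return the send buffer fill of fd in bytes as reported by SIOCOUTQ,
 * or -1 on error (errno is set by ioctl()). */
int send_queue_bytes(int fd)
{
    int queued = 0;

    assert(fd >= 0);
    if (ioctl(fd, SIOCOUTQ, &queued) < 0)
        return -1;
    return queued;
}
```

A caller would typically poll this value to decide whether to produce more data, which is the same way the ioctl is used with TCP and UDP sockets.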
2020-07-23 | Merge branch 'Add-DSA-yaml-binding' | David S. Miller | 3 | -256/+99
Kurt Kanzenbach says:

====================
Add DSA yaml binding

As discussed [1] [2], it makes sense to add a DSA yaml binding. This is the second version, and it now contains two ways of specifying the switch ports: either by "ports" or by "ethernet-ports". That is why the third patch also adjusts the DSA core for it.

Tested in combination with the hellcreek.yaml file.

Changes since v1:

 * Use select to not match unrelated switches
 * Allow ethernet-port(s)
 * List ethernet-controller properties
 * Include better description
 * Let dsa.txt refer to dsa.yaml

Thanks,
Kurt

[1] - https://lkml.kernel.org/netdev/449f0a03-a91d-ae82-b31f-59dfd1457ec5@gmail.com/
[2] - https://lkml.kernel.org/netdev/20200710090618.28945-1-kurt@linutronix.de/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 | net: dsa: of: Allow ethernet-ports as encapsulating node | Kurt Kanzenbach | 1 | -2/+6
Because of the unified Ethernet Switch Device Tree Bindings, allow ethernet-ports as an encapsulating node as well. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 | dt-bindings: net: dsa: Let dsa.txt refer to dsa.yaml | Kurt Kanzenbach | 1 | -254/+1
The DSA bindings have been converted to YAML. Therefore, the old text style documentation should refer to that one. The text file can be removed completely once all the existing DSA switch bindings have been converted as well. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 | dt-bindings: net: dsa: Add DSA yaml binding | Kurt Kanzenbach | 1 | -0/+92
For future DSA drivers it makes sense to add a generic DSA yaml binding that they can then use. This was created using the properties from dsa.txt. It includes the ports and the dsa,member property. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: mscc: ocelot: fix non-initialized CPU port on VSC7514 | Vladimir Oltean | 1 | -14/+14
The VSC7514 is marketed as a 10-port switch, however it has 11 physical ports (0->10) in the block diagram: https://www.microsemi.com/product-directory/ethernet-switches/3992-vsc7514 (also in the device tree at arch/mips/boot/dts/mscc/ocelot.dtsi) Additionally, by architecture it has one more entry in the analyzer block, situated right after the physical ports, for the CPU port module. This is not a physical port, it only represents a channel for frame injection and extraction. That entry for the CPU port is at index 11 in the analyzer. When the register groups for QSYS_SWITCH_PORT_MODE, SYS_PORT_MODE and SYS_PAUSE_CFG are declared to be replicated 11 times, the 11th entry in the array of regfields is not initialized, so the CPU port module is not initialized either. The documentation of QSYS_SWITCH_PORT_MODE for VSC7514 also says that this register group is replicated 12 times, so this patch is simply reflecting that and not introducing any further inconsistency. Fixes: 886e1387c73d ("net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields") Fixes: 541132f0961a ("net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield") Reported-by: Bryan Whitehead <bryan.whitehead@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
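The off-by-one described above can be stated as simple index arithmetic. The sketch below uses hypothetical constant and function names, not the driver's identifiers; the only facts taken from the commit message are the 11 physical ports (0 to 10), the CPU port module at index 11, and the replication counts of 11 and 12.

```c
#include <assert.h>

#define HYP_NUM_PHYS_PORTS   11                        /* physical ports 0..10 */
#define HYP_CPU_PORT_INDEX   HYP_NUM_PHYS_PORTS        /* entry right after them */
#define HYP_NUM_PORT_ENTRIES (HYP_NUM_PHYS_PORTS + 1)  /* physical ports + CPU */

/* Returns 1 if a per-port register group that is replicated `count`
 * times has an entry for `port`, 0 otherwise. */
int group_covers_port(int count, int port)
{
    assert(count >= 0 && port >= 0);
    return port < count;
}
```

A group declared with 11 replications covers indices 0 to 10 only, so index 11 is left without an initialized field; declaring 12 replications covers it.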
2020-07-22 | net: explicitly include <linux/compat.h> in net/core/sock.c | Christoph Hellwig | 1 | -0/+1
The buildbot found a config where the header isn't already implicitly pulled in, so add an explicit include as well. Fixes: 8c918ffbbad4 ("net: remove compat_sock_common_{get,set}sockopt") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | David S. Miller | 68 | -526/+4930
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 46 non-merge commits during the last 6 day(s) which contain a total of 68 files changed, 4929 insertions(+), 526 deletions(-).

The main changes are:

1) Run BPF program on socket lookup, from Jakub.
2) Introduce cpumap, from Lorenzo.
3) s390 JIT fixes, from Ilya.
4) teach riscv JIT to emit compressed insns, from Luke.
5) use build time computed BTF ids in bpf iter, from Yonghong.
====================

Purely independent overlapping changes in both filter.h and xdp.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | Merge branch 'ionic-updates' | David S. Miller | 5 | -44/+89
Shannon Nelson says: ==================== ionic updates These are a few odd code tweaks to the ionic driver: FW defined MTU limits, remove unnecessary code, and other tidiness tweaks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | ionic: interface file updates | Shannon Nelson | 1 | -20/+68
Add some new interface values and update a few more descriptions. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | ionic: rearrange reset and bus-master control | Shannon Nelson | 1 | -5/+4
We can prevent potential incorrect DMA access attempts from the NIC by enabling bus-master after the reset, and by disabling bus-master earlier in cleanup. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | ionic: update eid test for overflow | Shannon Nelson | 1 | -1/+1
Fix up our comparison to better handle a potential (but largely unlikely) wrap around. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
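The commit message does not show the new comparison. A common idiom for comparing monotonically increasing ids in a way that tolerates wrap-around is to take the unsigned difference and look at its sign. The sketch below illustrates that idiom with a hypothetical function name and 64-bit ids as an assumption; it is not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if event id `eid` is not newer than `last_eid`, treating
 * the ids as points on a 64-bit circle. The unsigned subtraction wraps
 * by definition; the cast to int64_t relies on two's complement
 * conversion, which gcc and clang provide. */
int eid_is_stale(uint64_t eid, uint64_t last_eid)
{
    return (int64_t)(eid - last_eid) <= 0;
}

/* Usage example: a plain "eid <= last_eid" test would wrongly treat
 * the id 2 as stale once last_eid is close to UINT64_MAX. */
void eid_self_check(void)
{
    assert(eid_is_stale(5, 5));
    assert(eid_is_stale(4, 5));
    assert(!eid_is_stale(6, 5));
    assert(!eid_is_stale(2, UINT64_MAX - 1));  /* wrapped, still newer */
}
```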
2020-07-22 | ionic: remove unused ionic_coal_hw_to_usec | Shannon Nelson | 1 | -13/+0
Clean up some unused code. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | ionic: set netdev default name | Shannon Nelson | 1 | -0/+1
If the host system's udev fails to set a new name for the network port, there is no NETDEV_CHANGENAME event to trigger the driver to send the name down to the firmware. It is safe to set the lif name multiple times, so we add a call early on to set the default netdev name to be sure the FW has something to use in its internal debug logging. Then when udev gets around to changing it we can update it to the actual name the system will be using. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | ionic: get MTU from lif identity | Shannon Nelson | 3 | -5/+15
Change from using hardcoded MTU limits and instead use the firmware defined limits. The value from the LIF attributes is the frame size, so we take off the header size to convert to MTU size. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
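The conversion from frame size to MTU is a subtraction of the header size. The commit message does not say which headers are counted, so the sketch below assumes an Ethernet header plus one VLAN tag; the constants and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HYP_ETH_HLEN  14u  /* Ethernet header: 6 + 6 + 2 bytes (assumed) */
#define HYP_VLAN_HLEN  4u  /* one 802.1Q tag (assumed) */

/* Convert a firmware-reported frame size limit into an MTU limit by
 * taking off the assumed header size. */
uint32_t frame_size_to_mtu(uint32_t frame_size)
{
    assert(frame_size >= HYP_ETH_HLEN + HYP_VLAN_HLEN);
    return frame_size - HYP_ETH_HLEN - HYP_VLAN_HLEN;
}
```

Under these assumptions a frame size limit of 1518 bytes corresponds to an MTU of 1500.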
2020-07-22 | bareudp: Reverted support to enable & disable rx metadata collection | Martin Varghese | 4 | -22/+7
The commit fe80536acf83 ("bareudp: Added attribute to enable & disable rx metadata collection") breaks the original (5.7) default behavior of the bareudp module, which collects RX metadata on receive. It was added to avoid the crash in the kernel neighbour subsystem when a packet with metadata from bareudp is processed. But it is no longer needed, as the commit 394de110a733 ("net: Added pointer check for dst->ops->neigh_lookup in dst_neigh_lookup_skb") solves this crash. Fixes: fe80536acf83 ("bareudp: Added attribute to enable & disable rx metadata collection") Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | Merge branch 'dpaa2-eth-add-support-for-TBF-offload' | David S. Miller | 5 | -7/+126
Ioana Ciornei says: ==================== dpaa2-eth: add support for TBF offload This patch set adds support for TBF offload in dpaa2-eth. The first patch restructures how the .ndo_setup_tc() callback is implemented (each Qdisc is treated in a separate function), the second patch just adds the necessary APIs for configuring the Tx shaper, and the last one handles TC_SETUP_QDISC_TBF and configures the shaper as requested. ==================== Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | dpaa2-eth: add support for TBF offload | Ioana Ciornei | 2 | -1/+48
React to TC_SETUP_QDISC_TBF and configure the egress shaper as appropriate with the maximum rate and burst size requested by the user. TBF can only be offloaded on DPAA2 when it's the root qdisc, i.e. when it's a per port shaper. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | dpaa2-eth: add API for Tx shaping | Ioana Ciornei | 3 | -0/+65
Add the necessary API (dpni_set_tx_shaping) for configuring the rate and burst size of a per port shaper in DPAA2. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | dpaa2-eth: move the mqprio setup into a separate function | Ioana Ciornei | 1 | -6/+13
Move the setup done for MQPRIO into a separate function so that with the addition of another offload we do not crowd dpaa2_eth_setup_tc(). After this restructuring it's easier to see what is supported in terms of Qdisc offloading. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | mptcp: move helper to where its used | Florian Westphal | 2 | -11/+12
Only used in token.c. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | Merge branch 'devlink-small-improvements' | David S. Miller | 2 | -4/+6
Parav Pandit says:

====================
devlink small improvements

This short series improves the devlink code by adding a lock comment, simplifying checks, and limiting the scope of the mutex lock to the necessary fields.

Patch summary:
Patch-1 Keeps the devlink_mutex only for the necessary changes
Patch-2 Avoids duplicate check for reload flag
Patch-3 Adds missing comment for the scope of devlink instance lock
Patch-4 Constifies devlink instance pointer
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | devlink: Constify devlink instance pointer | Parav Pandit | 1 | -1/+1
Constify devlink instance pointer while checking if reload operation is supported or not. This helps to review the scope of checks done in reload. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | devlink: Add comment for devlink instance lock | Parav Pandit | 1 | -1/+3
Add comment to describe the purpose of devlink instance lock. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | devlink: Avoid duplicate check for reload enabled flag | Parav Pandit | 1 | -1/+1
Whether the reload operation is enabled is already checked by devlink_reload(). Hence, remove the duplicate check from devlink_nl_cmd_reload(). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | devlink: Do not hold devlink mutex when initializing devlink fields | Parav Pandit | 1 | -1/+1
There is no need to hold a device global lock when initializing devlink device fields of a devlink instance which is not yet part of the devices list. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | r8169: allow to enable ASPM on RTL8125A | Heiner Kallweit | 1 | -0/+2
For most chip versions this has been added already. Allow also for RTL8125A to enable ASPM. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | Merge branch 'ena-driver-new-features' | David S. Miller | 9 | -103/+220
Arthur Kiyanovski says:

====================
ENA driver new features

V4 changes:
-----------
Add smp_rmb() to "net: ena: avoid unnecessary rearming of interrupt vector when busy-polling" to adhere to the linux kernel memory model, and update the commit message accordingly.

V3 changes:
-----------
1. Add "net: ena: enable support of rss hash key and function changes" patch again, with more explanations why it should be in net-next in commit message.
2. Add synchronization considerations to "net: ena: avoid unnecessary rearming of interrupt vector when busy-polling"

V2 changes:
-----------
1. Update commit messages of 2 patches to be more verbose.
2. Remove "net: ena: enable support of rss hash key and function changes" patch. Will be resubmitted to net.

V1 cover letter:
----------------
This patchset contains performance improvements, support for new devices and functionality:

1. Support for upcoming ENA devices
2. Avoid unnecessary IRQ unmasking in busy poll to reduce interrupt rate
3. Enabling device support for RSS function and key manipulation
4. Support for NIC-based traffic mirroring (SPAN port)
5. Additional PCI device ID
6. Cosmetic changes
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: ena: support new LLQ acceleration mode | Arthur Kiyanovski | 7 | -24/+109
New devices add a new hardware acceleration engine, which adds some restrictions to the driver. Metadata descriptor must be present for each packet and the maximum burst size between two doorbells is now limited to a number advertised by the device.

This patch adds:

1. A handshake protocol between the driver and the device, so the device will enable the accelerated queues only when both sides support it.
2. The driver support for the new acceleration engine:
   2.1. Send metadata descriptor for each Tx packet.
   2.2. Limit the number of packets sent between doorbells.(*)

(*) A previous driver implementation of this feature was committed in commit 05d62ca218f8 ("net: ena: add handling of llq max tx burst size"), however the design of the interface between the driver and device changed since then. This change is reflected in this commit.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: ena: move llq configuration from ena_probe to ena_device_init() | Arthur Kiyanovski | 1 | -63/+73
When the ENA device resets to recover from some error state, all LLQ configuration values are reset to their defaults, because LLQ is initialized only once during ena_probe(). Changes in this commit: 1. Move the LLQ configuration process into ena_init_device() which is called from both ena_probe() and ena_restore_device(). This way, LLQ setup configurations that are different from the default values will survive resets. 2. Extract the LLQ bar mapping to ena_map_llq_bar(), and call once in the lifetime of the driver from ena_probe(), since there is no need to unmap and map the LLQ bar again every reset. 3. Map the LLQ bar if it exists, regardless if initialization of LLQ placement policy (ENA_ADMIN_PLACEMENT_POLICY_DEV) succeeded or not. Initialization might fail the first time, falling back to the ENA_ADMIN_PLACEMENT_POLICY_HOST placement policy, but later succeed after device reset, in which case the LLQ bar needs to be mapped already. Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: ena: enable support of rss hash key and function changes | Arthur Kiyanovski | 2 | -2/+6
Add the rss_configurable_function_key bit to driver_supported_feature. This bit tells the device that the driver in question supports the retrieving and updating of RSS function and hash key, and therefore the device should allow RSS function and key manipulation. This commit turns on device support for hash key and RSS function management. Without this commit this feature is turned off at the device and appears to the user as unsupported. This commit concludes the following series of already merged commits: commit 0af3c4e2eab8 ("net: ena: changes to RSS hash key allocation") commit c1bd17e51c71 ("net: ena: change default RSS hash function to Toeplitz") commit f66c2ea3b18a ("net: ena: allow setting the hash function without changing the key") commit e9a1de378dd4 ("net: ena: fix error returning in ena_com_get_hash_function()") commit 80f8443fcdaa ("net: ena: avoid unnecessary admin command when RSS function set fails") commit 6a4f7dc82d1e ("net: ena: rss: do not allocate key when not supported") commit 0d1c3de7b8c7 ("net: ena: fix incorrect default RSS key") The above commits represent the last part of the implementation of this feature, and with them merged the feature can be enabled in the device. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: ena: add support for traffic mirroring | Arthur Kiyanovski | 2 | -7/+13
Add support for traffic mirroring, where the hardware reads the buffer from the instance memory directly. Traffic Mirroring needs access to the rx buffers in the instance. To have this access, this patch: 1. Changes the code to map and unmap the rx buffers bidirectionally. 2. Enables the relevant bit in driver_supported_features to indicate to the FW that this driver supports traffic mirroring. Rx completion is not generated until mirroring is done to avoid the situation where the driver changes the buffer before it is mirrored. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: ena: cosmetic: change ena_com_stats_admin stats to u64 | Arthur Kiyanovski | 2 | -7/+7
The size of the admin statistics in ena_com_stats_admin is changed from 32bit to 64bit so as to align with the sizes of the other statistics in the driver (i.e. rx_stats, tx_stats and ena_stats_dev). This is done as part of an effort to create a unified API to read statistics. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: ena: cosmetic: satisfy gcc warning | Arthur Kiyanovski | 1 | -1/+1
gcc 4.8 reports a warning when initializing with = {0}. Dropping the "0" from the braces fixes the issue. This fix is not ANSI compatible but is allowed by gcc. Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
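The two initializer forms can be compared side by side. The commit message does not name the warning or the structure involved; the sketch below assumes a structure whose first member is itself a structure, a case where older gcc versions are known to complain about `{0}`. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct hyp_inner { int a; int b; };
struct hyp_outer { struct hyp_inner first; int tail; };

/* Returns 1 if every member of *o is zero. */
int outer_is_zero(const struct hyp_outer *o)
{
    assert(o != NULL);
    return o->first.a == 0 && o->first.b == 0 && o->tail == 0;
}

struct hyp_outer make_with_zero(void)
{
    struct hyp_outer o = {0};  /* standard C; some old gcc versions warn here */
    return o;
}

struct hyp_outer make_with_empty(void)
{
    struct hyp_outer o = {};   /* accepted by gcc as an extension (standard only in C23) */
    return o;
}
```

Both forms zero every member; the difference is only in which compilers and language standards accept them without a diagnostic.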
2020-07-22 | net: ena: add reserved PCI device ID | Arthur Kiyanovski | 1 | -0/+5
Add a reserved PCI device ID to the driver's table. Used for internal testing purposes. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: ena: avoid unnecessary rearming of interrupt vector when busy-polling | Arthur Kiyanovski | 2 | -1/+8
For an overview of the race created by this patch, see the "synchronization" part of this message.

In napi busy-poll mode, the kernel invokes the napi handler of the device repeatedly to poll the NIC's receive queues. This process repeats until a timeout, specific for each connection, is up. By polling packets in busy-poll mode the user may gain lower latency and higher throughput (since the kernel no longer waits for interrupts to poll the queues) at the expense of CPU usage.

Upon completing a napi routine, the driver checks whether the routine was called by an interrupt handler. If so, the driver re-enables interrupts for the device. This is needed since an interrupt routine invocation disables future invocations until explicitly re-enabled.

The driver avoids re-enabling the interrupts if they were not disabled in the first place (e.g. if the driver is in busy-poll mode). Originally, the driver checked whether interrupt re-enabling is needed by reading the 'ena_napi->unmask_interrupt' variable. This atomic variable was set upon interrupt and cleared after re-enabling it.

In the 4.10 Linux version, the 'napi_complete_done' call was changed so that it returns 'false' when the device should not re-enable interrupts, and 'true' otherwise. The change includes reading the "NAPIF_STATE_IN_BUSY_POLL" flag to check if the napi call is in busy-poll mode, and if so, return 'false'. The driver was changed to re-enable interrupts according to this routine's return value.

The Linux community rejected the use of the 'ena_napi->unmask_interrupt' variable to determine whether unmasking is needed, and urged the use of the napi_complete_done() return value solely. See https://lore.kernel.org/patchwork/patch/741149/ for more details.

As explained, a busy-poll session exists for a specified timeout value, after which it exits the busy-poll mode and re-enters it later. This leads to many invocations of the napi handler where napi_complete_done() falsely indicates that interrupts should be re-enabled. This creates a bug in which the interrupts are re-enabled unnecessarily.

To reproduce this bug:
1) echo 50 | sudo tee /proc/sys/net/core/busy_poll
2) echo 50 | sudo tee /proc/sys/net/core/busy_read
3) Add counters that check whether 'ena_unmask_interrupt(tx_ring, rx_ring);' is called without disabling the interrupts in the first place (i.e. without calling the interrupt routine ena_intr_msix_io())

Steps 1+2 enable busy-poll as the default mode for new connections.

The busy poll routine rearms the interrupts after every session by design, and so we need to add an extra check that the interrupts were masked in the first place.

synchronization:
This patch introduces a race between the interrupt handler ena_intr_msix_io() and the napi routine ena_io_poll(). Some macros and instructions were added to prevent this race from leaving the interrupts masked. The following specifies the different race scenarios in this patch:

1) interrupt handler and napi routine run sequentially
   i) interrupt handler is called, sets 'interrupts_masked' flag and successfully schedules the napi handler via softirq. In this scenario the napi routine might not see the flag change for several reasons:
      a) The flag is stored in a register by the compiler. For this case the WRITE_ONCE macro is used, which prevents this.
      b) The compiler might reorder the instruction. For this the smp_wmb() instruction was used, which implies a compiler memory barrier.
      c) On archs with weak consistency model (like ARM64) the napi routine might be scheduled and start running before the flag STORE instruction is committed to cache/memory. To ensure this doesn't happen, the smp_wmb() instruction was added. It ensures that the flag set instruction is committed before scheduling napi.
   ii) compiler reorders the flag's value check in the 'if' with the flag set in the napi routine. This scenario is prevented by the smp_rmb() call after the flag check.

2) interrupt handler and napi routine run in parallel (can happen when busy poll routine invokes the napi handler)
   i) interrupt handler sets the flag in one core, while the napi routine reads it in another core. This scenario is also divided into two cases:
      a) napi_complete_done() doesn't finish running, in which case napi_sched() would just set NAPIF_STATE_MISSED and the napi routine would reschedule itself without changing the flag's value.
      b) napi_complete_done() finishes running. In this case the napi routine might override the flag's value. This doesn't present any risk since it later unmasks the interrupt vector.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
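The flag handoff described in this commit message can be modelled in user space with C11 atomics. This is only a rough analogue, not the driver code: release and acquire ordering stand in for the WRITE_ONCE(), smp_wmb() and smp_rmb() primitives named above, and every type and function name below is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct hyp_napi {
    atomic_bool interrupts_masked;  /* set by the interrupt side */
    int unmask_count;               /* stands in for re-arming the vector */
};

/* Interrupt side: publish "the vector is masked" before the poll
 * routine is scheduled. The release store is the analogue of the
 * flag write followed by a write barrier. */
void hyp_irq_handler(struct hyp_napi *n)
{
    assert(n != NULL);
    atomic_store_explicit(&n->interrupts_masked, true, memory_order_release);
    /* ... the poll routine would be scheduled here ... */
}

/* Poll side: re-arm only if polling completed AND the interrupt side
 * masked the vector in the first place. Returns true if it re-armed. */
bool hyp_poll_done(struct hyp_napi *n, bool complete_done)
{
    assert(n != NULL);
    if (!complete_done)
        return false;
    if (!atomic_load_explicit(&n->interrupts_masked, memory_order_acquire))
        return false;  /* busy-poll invocation: nothing to unmask */
    atomic_store_explicit(&n->interrupts_masked, false, memory_order_relaxed);
    n->unmask_count++;
    return true;
}
```

The point of the extra flag check is visible in the second early return: a poll that was not preceded by an interrupt leaves the vector alone, even though the completion call reported that polling is done.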
2020-07-22 | qed: Fix ILT and XRCD bitmap memory leaks | Yuval Basson | 2 | -0/+6
- Free ILT lines used for XRC-SRQ's contexts.
- Free XRCD bitmap

Fixes: b8204ad878ce7 ("qed: changes to ILT to support XRC")
Fixes: 7bfb399eca460 ("qed: Add XRC to RoCE")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Basson <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | Merge branch 'Phylink-PCS-updates' | David S. Miller | 2 | -131/+344
Russell King says:

====================
Phylink PCS updates

This series updates the rudimentary phylink PCS support with the results of the last four months of development of that. Phylink PCS support was initially added back at the end of March, when it became clear that the current approach of treating everything at the MAC end as being part of the MAC was inadequate.

However, this rudimentary implementation was fine initially for mvneta and similar, but in practice had a fair number of issues, particularly when ethtool interfaces were used to change various link properties.

It became apparent that relying on the phylink_config structure for the PCS was also bad when it became clear that the same PCS was used in DSA drivers as well as in NXP's other offerings, and there was a desire to re-use that code.

It also became apparent that splitting the "configuration" step on an interface mode configuration between the MAC and PCS using just mac_config() and pcs_config() methods was not sufficient for some setups, as the MAC needed to be "taken down" prior to making changes, and once all settings were complete, the MAC could only then be resumed.

This series addresses these points, progressing PCS support, and has been developed with mvneta and DPAA2 setups, with work on both those drivers to prove this approach. It has been rigorously tested with mvneta, as that provides the most flexibility for testing the various code paths.

To solve the phylink_config reuse problem, we introduce a struct phylink_pcs, which contains the minimal information necessary, and it is intended that this is embedded in the PCS private data structure.

To solve the interface mode configuration problem, we introduce two new MAC methods, mac_prepare() and mac_finish() which wrap the entire interface mode configuration only. This has the additional benefit of relieving MAC drivers from working out whether an interface change has occurred, and whether they need to do some major work.

I have not yet updated all the interface documentation for these changes, that work remains, but this patch set is provided in the hope that those working on PCS support in NXP will find this useful.

Since there is a lot of change here, this is the reason why I strongly advise that everyone has converted to the mac_link_up() way of configuring the link parameters when the link comes up, rather than the old way of using mac_config() - especially as splitting the PCS changes how and when phylink calls mac_config(). Although no change for existing users is intended, that is something I no longer am able to test.

Changes since RFC:
- fix bisect build failure
- add patch to use config.an_enabled
- rename phylink_config_interface to phylink_major_reconfig
- add expanded documentation for phylink_set_pcs()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: phylink: add interface to configure clause 22 PCS PHY | Russell King | 2 | -0/+40
Add an interface to configure the advertisement for a clause 22 PCS PHY, and set the AN enable flag in the BMCR appropriately. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: phylink: add struct phylink_pcs | Russell King | 2 | -23/+56
Add a way for MAC PCS to have private data while keeping independence from struct phylink_config, which is used for the MAC itself. We need this independence as we will have stand-alone code for PCS that is independent of the MAC. Introduce struct phylink_pcs, which is designed to be embedded in a driver private data structure. This structure does not include an mdio_device as there are PCS implementations such as the Marvell DSA and network drivers where this is not necessary. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
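The embedding pattern described here can be sketched in plain C. Everything below is a userspace approximation: `struct sketch_pcs` stands in for struct phylink_pcs, `struct my_pcs_priv` is a hypothetical driver-private structure, and `sketch_container_of()` mimics the kernel's container_of() macro.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct phylink_pcs. */
struct sketch_pcs {
	int poll;	/* illustrative member only */
};

/* Hypothetical driver-private data with the PCS embedded in it. */
struct my_pcs_priv {
	int port;
	struct sketch_pcs pcs;
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the private data from a pointer to the embedded PCS. */
struct my_pcs_priv *to_my_pcs_priv(struct sketch_pcs *pcs)
{
	return sketch_container_of(pcs, struct my_pcs_priv, pcs);
}

/* A PCS callback receives only the embedded struct, yet can reach the port. */
int my_pcs_get_port(struct sketch_pcs *pcs)
{
	return to_my_pcs_priv(pcs)->port;
}
```

Because the private data is recovered from the embedded member, the PCS code needs no pointer back to the MAC's configuration, which is the independence the commit message asks for.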
2020-07-22 | net: phylink: re-implement interface configuration with PCS | Russell King | 2 | -23/+100
With PCS support, how we implement interface reconfiguration (or other major reconfiguration) is not up to the job; we end up reconfiguring the PCS for an interface change while the link could potentially be up. In order to solve this, add two additional MAC methods for major configuration, one to prepare for the change, and one to finish the change. This allows mvneta and mvpp2 to shut down what they require prior to the MAC and PCS configuration calls, and then restart as appropriate. This impacts ksettings_set(), which now needs to identify whether the change is a minor tweak to the advertisement masks or whether the interface mode has changed, and call the appropriate function for that update. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
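The ordering this describes (prepare, then the MAC and PCS configuration calls, then finish) can be sketched as below. This is a userspace model, not the phylink code: the ops table, the `demo_*` callbacks, `sketch_major_config()` and its error handling are all assumptions made for illustration.

```c
#include <stddef.h>

enum sketch_step { STEP_PREPARE, STEP_MAC_CONFIG, STEP_PCS_CONFIG, STEP_FINISH };

#define MAX_STEPS 8

/* Records which callbacks ran, and in what order. */
struct sketch_port {
	enum sketch_step steps[MAX_STEPS];
	size_t count;
	int prepare_result;	/* lets a caller simulate mac_prepare() failing */
};

/* Hypothetical ops table; member names mirror the methods named above. */
struct sketch_ops {
	int (*mac_prepare)(struct sketch_port *port);
	void (*mac_config)(struct sketch_port *port);
	int (*pcs_config)(struct sketch_port *port);
	int (*mac_finish)(struct sketch_port *port);
};

static void record(struct sketch_port *port, enum sketch_step step)
{
	if (port->count < MAX_STEPS)
		port->steps[port->count++] = step;
}

int demo_mac_prepare(struct sketch_port *port)
{
	record(port, STEP_PREPARE);	/* e.g. take the MAC down here */
	return port->prepare_result;
}

void demo_mac_config(struct sketch_port *port)
{
	record(port, STEP_MAC_CONFIG);
}

int demo_pcs_config(struct sketch_port *port)
{
	record(port, STEP_PCS_CONFIG);
	return 0;
}

int demo_mac_finish(struct sketch_port *port)
{
	record(port, STEP_FINISH);	/* e.g. resume the MAC here */
	return 0;
}

/* Major reconfiguration: prepare, configure MAC and PCS, then finish. */
int sketch_major_config(const struct sketch_ops *ops, struct sketch_port *port)
{
	int err;

	if (ops->mac_prepare) {
		err = ops->mac_prepare(port);
		if (err < 0)
			return err;
	}

	ops->mac_config(port);

	if (ops->pcs_config) {
		err = ops->pcs_config(port);
		if (err < 0)
			return err;
	}

	if (ops->mac_finish)
		return ops->mac_finish(port);

	return 0;
}
```

In this model the two new methods are optional, so a driver that never needs to stop its MAC can leave them unset; whether the real methods are optional is not stated in the commit message.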
2020-07-22 | net: phylink: in-band pause mode advertisement update for PCS | Russell King | 2 | -6/+56
Re-code the pause in-band advertisement update in light of the addition of PCS support, so that we perform the minimum required; only the PCS configuration function needs to be called in this case, followed by the request to trigger a restart of negotiation if the programmed advertisement changed. We need to change the pcs_config() signature to pass whether resolved pause should be passed to the MAC for setups such as mvneta and mvpp2 where doing so overrides the MAC manual flow controls. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: phylink: simplify fixed-link case for ksettings_set method | Russell King | 1 | -11/+20
For fixed links, we only allow the current settings, so this should be a matter of merely rejecting an attempt to change the settings. If the settings agree, then there is nothing more we need to do. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
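A minimal sketch of that rule, assuming the settings can be reduced to speed and duplex: any request that differs from the current fixed-link settings is rejected, and a matching request succeeds without doing anything. The struct and function names are hypothetical, and the real check may compare more fields.

```c
#include <errno.h>

/* Hypothetical reduced view of the link settings. */
struct sketch_link_settings {
	int speed;	/* in Mbps */
	int duplex;	/* 0 = half, 1 = full */
};

/*
 * Fixed links only accept the settings they already have: reject any
 * attempt to change them, and do nothing if the request matches.
 */
int sketch_fixed_ksettings_set(const struct sketch_link_settings *cur,
			       const struct sketch_link_settings *req)
{
	if (req->speed != cur->speed || req->duplex != cur->duplex)
		return -EINVAL;

	return 0;
}
```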
2020-07-22 | net: phylink: use config.an_enabled in ksettings_set method | Russell King | 1 | -2/+1
Rather than recomputing whether AN is enabled, use config.an_enabled. Suggested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: phylink: simplify phy case for ksettings_set method | Russell King | 1 | -57/+47
When we have a PHY attached, an ethtool ksettings_set() call only really needs to call through to the phylib equivalent; phylib will call back to us when the link changes so we can update our state. Therefore, we can bypass most of our ksettings_set() call for this case. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 | net: phylink: simplify ksettings_set() implementation | Russell King | 1 | -13/+12
Simplify the ksettings_set() implementation to look more like phylib's implementation; use a switch() for validating the autoneg setting, and use the linkmode_modify() helper to set the autoneg bit in the advertisement mask. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
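The shape of that change can be sketched in userspace C: a switch() validates the requested autoneg value, and a single helper then sets or clears the autoneg bit in the advertisement mask. The AUTONEG_* values match the ethtool UAPI; the 64-bit mask, `SKETCH_AUTONEG_BIT`, `sketch_mod_bit()` and the empty-advertisement check are assumptions of this sketch rather than the patched code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the ethtool UAPI. */
#define AUTONEG_DISABLE	0x00
#define AUTONEG_ENABLE	0x01

/* Hypothetical bit position for "autoneg" in this sketch's 64-bit mask. */
#define SKETCH_AUTONEG_BIT	6

/* Set or clear one bit in the mask depending on 'set'. */
void sketch_mod_bit(uint64_t *mask, unsigned int bit, bool set)
{
	if (set)
		*mask |= UINT64_C(1) << bit;
	else
		*mask &= ~(UINT64_C(1) << bit);
}

/* Validate the autoneg request, then reflect it in the advertisement. */
int sketch_apply_autoneg(uint8_t autoneg, uint64_t *advertising)
{
	switch (autoneg) {
	case AUTONEG_DISABLE:
		break;
	case AUTONEG_ENABLE:
		/* Assumed rule: enabling AN needs something to advertise. */
		if (*advertising == 0)
			return -EINVAL;
		break;
	default:
		return -EINVAL;
	}

	sketch_mod_bit(advertising, SKETCH_AUTONEG_BIT,
		       autoneg == AUTONEG_ENABLE);
	return 0;
}
```

Using one modify helper in place of separate set and clear branches keeps the bit update to a single call whose third argument carries the decision.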