summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2021-12-29Merge branch '10GbE' of ↵Jakub Kicinski8-13/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 10GbE Intel Wired LAN Driver Updates 2021-12-28 Alexander Lobakin says: napi_build_skb() I introduced earlier this year ([0]) aims to decrease MM pressure and the overhead from in-place kmem_cache_alloc() on each Rx entry processing by decaching skbuff_heads from NAPI per-cpu cache filled prior to that by napi_consume_skb() (so it is sort of a direct shortcut for free -> mm -> alloc cycle). Currently, no in-tree drivers use it. Switch all Intel Ethernet drivers to it to get slight-to-medium perf boosts depending on the frame size. ice driver, 50 Gbps link, pktgen + XDP_PASS (local in) sample: frame_size/nthreads 64/42 128/20 256/8 512/4 1024/2 1532/1 net-next (Kpps) 46062 34654 18248 9830 5343 2714 series 47438 34708 18330 9875 5435 2777 increase 2.9% 0.15% 0.45% 0.46% 1.72% 2.32% Additionally, e1000's been switched to napi_consume_skb() as it's safe and works fine there, and there's no point in napi_build_skb() without paired NAPI cache feeding point. [0] https://lore.kernel.org/all/20210213141021.87840-1-alobakin@pm.me * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ixgbevf: switch to napi_build_skb() ixgbe: switch to napi_build_skb() igc: switch to napi_build_skb() igb: switch to napi_build_skb() ice: switch to napi_build_skb() iavf: switch to napi_build_skb() i40e: switch to napi_build_skb() e1000: switch to napi_build_skb() e1000: switch to napi_consume_skb() ==================== Link: https://lore.kernel.org/r/20211228175815.281449-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29net: lantiq_etop: add blank line after declarationAleksander Jan Bajkowski1-0/+1
This patch adds a missing line after the declaration and fixes the checkpatch warning: WARNING: Missing a blank line after declarations + int desc; + for (desc = 0; desc < LTQ_DESC_NUM; desc++) Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Link: https://lore.kernel.org/r/20211228220031.71576-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29net: lantiq_etop: add missing comment for wmb()Aleksander Jan Bajkowski1-0/+1
This patch adds the missing code comment for memory barrier call and fixes checkpatch warning: WARNING: memory barrier without comment + wmb(); Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Link: https://lore.kernel.org/r/20211228214910.70810-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29r8169: don't use pci_irq_vector() in atomic contextThomas Gleixner1-7/+7
Since referenced change pci_irq_vector() can't be used in atomic context any longer. This conflicts with our usage of this function in rtl8169_netpoll(). Therefore store the interrupt number in struct rtl8169_private. Fixes: 495c66aca3da ("genirq/msi: Convert to new functions") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/3cd24763-f307-78f5-76ed-a5fbf315fb28@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-28ixgbevf: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ixgbevf driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28ixgbe: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ixgbe driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28igc: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. igc driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28igb: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. igb driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28ice: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ice driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28iavf: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. iavf driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28i40e: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. i40e driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28e1000: switch to napi_build_skb()Alexander Lobakin1-1/+1
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx element. e1000 driver runs Tx completion polling cycle right before the Rx one. Now that e1000 uses napi_consume_skb() to put skbuff_heads of completed entries into the cache, it will never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx and increase throughput. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28e1000: switch to napi_consume_skb()Alexander Lobakin1-5/+7
In order to take the best from per-cpu NAPI skbuff_head caches and CPU cycles, let's switch from dev_kfree_skb_any(), which passes skb back to the mm layer, to napi_consume_skb(), which feeds those caches on non-zero budget instead (falls back to the former on 0). Do the replacement in e1000_unmap_and_free_tx_resource(). There are 4 call sites of this function throughout the driver: * e1000_clean_tx_ring(). Slowpath, process context, cleans the whole Tx ring on ifdown. Use budget of 0 here; * e1000_tx_map(). Hotpath, net Tx softirq, unmaps the buffers in case of error. Use 0 as well; * e1000_clean_tx_irq(). Hotpath, NAPI Tx completion polling cycle. As the driver doesn't count completed Tx entries towards the NAPI budget, just use the poll budget of 64 to utilize caches. Apart from being a preparation for switching to napi_build_skb(), this is useful on its own as well, as napi_consume_skb() flushes skb caches by batches of 32 instead of one-at-a-time. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-27net:Remove initialization of static variables to 0Wen Zhiwei1-3/+3
Delete the initialization of three static variables because it is meaningless. Signed-off-by: Wen Zhiwei <wenzhiwei@kylinos.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27net: ethernet: ti: davinci_emac: Use platform_get_irq() to get the interruptLad Prabhakar1-27/+42
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq() for DT users only. While at it propagate error code in case request_irq() fails instead of returning -EBUSY. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27net: xilinx: emaclite: Use platform_get_irq() to get the interruptLad Prabhakar1-6/+3
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27net: ethoc: Use platform_get_irq() to get the interruptLad Prabhakar1-6/+3
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27fsl/fman: Use platform_get_irq() to get the interruptLad Prabhakar1-16/+16
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). While doing so return error pointer from read_dts_node() as platform_get_irq() may return -EPROBE_DEFER. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27net: pxa168_eth: Use platform_get_irq() to get the interruptLad Prabhakar1-4/+5
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27ethernet: netsec: Use platform_get_irq() to get the interruptLad Prabhakar1-7/+6
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27net: wwan: iosm: Keep device at D0 for s2idle caseKai-Heng Feng1-0/+3
We are seeing spurious wakeup caused by Intel 7560 WWAN on AMD laptops. This prevent those laptops to stay in s2idle state. >From what I can understand, the intention of ipc_pcie_suspend() is to put the device to D3cold, and ipc_pcie_suspend_s2idle() is to keep the device at D0. However, the device can still be put to D3hot/D3cold by PCI core. So explicitly let PCI core know this device should stay at D0, to solve the spurious wakeup. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27net: wwan: iosm: Let PCI core handle PCI power transitionKai-Heng Feng1-47/+2
pci_pm_suspend_noirq() and pci_pm_resume_noirq() already handle power transition for system-wide suspend and resume, so it's not necessary to do it in the driver. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27net: lan966x: Fix the vlan used by host portsHoratiu Vultur1-3/+3
The blamed commit changed the vlan used by the host ports to be 4095 instead of 0. Because of this change the following issues are seen: - when the port is probed first it was adding an entry in the MAC table with the wrong vlan (port->pvid which is default 0) and not HOST_PVID - when the port is removed from a bridge, it was using the wrong vlan to add entries in the MAC table. It was using the old PVID and not the HOST_PVID This patch fixes this two issues by using the HOST_PVID instead of port->pvid. Fixes: 6d2c186afa5d5d ("net: lan966x: Add vlan support.") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27bnxt_en: Use page frag RX buffers for better software GRO performanceJakub Kicinski1-12/+15
If NETIF_F_GRO_HW is disabled, the existing driver code uses kmalloc'ed data for RX buffers. This causes inefficient SW GRO performance because the GRO data is merged using the less efficient frag_list. Use netdev_alloc_frag() and friends instead so that GRO data can be merged into skb_shinfo(skb)->frags for better performance. [Use skb_free_frag() - Vikas Gupta] Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27bnxt_en: convert to xdp_do_flushEdwin Peer1-1/+1
The xdp_do_flush_map function has been replaced with the more general xdp_do_flush(). Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27bnxt_en: Support CQE coalescing mode in ethtoolMichael Chan1-1/+23
Support showing and setting the CQE mode in ethtool. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27bnxt_en: Support configurable CQE coalescing modeMichael Chan2-5/+12
CQE coalescing mode is the same as the timer reset coalescing mode on Broadcom devices. Currently this mode is always enabled if it is supported by the device. Restructure the code slightly to support dynamically changing this mode. Add a flags field to struct bnxt_coal. Initially, the CQE flag will be set for the RX and TX side if the device supports it. We need to move bnxt_init_dflt_coal() to set up default coalescing until the capability is determined. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27bnxt_en: enable interrupt sampling on 5750X for DIMAndy Gospodarek1-1/+13
5750X (P5) chips handle receiving packets on the NQ rather than the main completion queue so we need to get and set stats from the correct spots for dynamic interrupt moderation. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27bnxt_en: Log error report for dropped doorbellMichael Chan1-2/+8
Log the unrecognized error report type value as well. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27bnxt_en: Add event handler for PAUSE Storm eventSomnath Kotur1-0/+3
FW has been modified to send a new async event when it detects a pause storm. Register for this new event and log it upon receipt. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-24net: phy: micrel: Add config_init for LAN8814Horatiu Vultur1-0/+32
Add config_init for LAN8814. This function is required for the following reasons: - we need to make sure that the PHY is reset, - disable ANEG with QSGMII PCS Host side - swap the MDI-X A,B transmit so that there will not be any link flip-flaps when the PHY gets a link. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-24net: wan/lmc: fix spelling of "its"Randy Dunlap1-1/+1
Use the possessive "its" instead of the contraction of "it is" ("it's") in user messages. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski23-147/+268
include/net/sock.h commit 8f905c0e7354 ("inet: fully convert sk->sk_rx_dst to RCU rules") commit 43f51df41729 ("net: move early demux fields close to sk_refcnt") https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5MNobuhiro Iwamatsu1-1/+1
ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a value, which is 0. Fix from BIT(0) to 0. Reported-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp> Fixes: b38dd98ff8d0 ("net: stmmac: Add Toshiba Visconti SoCs glue driver") Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23r8152: sync ocp baseHayes Wang1-4/+22
There are some chances that the actual base of hardware is different from the value recorded by driver, so we have to reset the variable of ocp_base to sync it. Set ocp_base to -1. Then, it would be updated and the new base would be set to the hardware next time. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23r8152: fix the force speed doesn't work for RTL8156Hayes Wang1-0/+17
It needs to set mdio force mode. Otherwise, link off always occurs when setting force speed. Fixes: 195aae321c82 ("r8152: support new chips") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23net: stmmac: ptp: fix potentially overflowing expressionXiaoliang Yang1-1/+1
Convert the u32 variable to type u64 in a context where expression of type u64 is required to avoid potential overflow. Fixes: e9e3720002f6 ("net: stmmac: ptp: update tas basetime after ptp adjust") Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23veth: ensure skb entering GRO are not cloned.Paolo Abeni1-2/+6
After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"), if GRO is enabled on a veth device and TSO is disabled on the peer device, TCP skbs will go through the NAPI callback. If there is no XDP program attached, the veth code does not perform any share check, and shared/cloned skbs could enter the GRO engine. Ignat reported a BUG triggered later-on due to the above condition: [ 53.970529][ C1] kernel BUG at net/core/skbuff.c:3574! [ 53.981755][ C1] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI [ 53.982634][ C1] CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc5+ #25 [ 53.982634][ C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 [ 53.982634][ C1] RIP: 0010:skb_shift+0x13ef/0x23b0 [ 53.982634][ C1] Code: ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 41 0c 00 00 41 80 7f 02 00 4d 8d b5 d0 00 00 00 0f 85 74 f5 ff ff <0f> 0b 4d 8d 77 20 be 04 00 00 00 4c 89 44 24 78 4c 89 f7 4c 89 8c [ 53.982634][ C1] RSP: 0018:ffff8881008f7008 EFLAGS: 00010246 [ 53.982634][ C1] RAX: 0000000000000000 RBX: ffff8881180b4c80 RCX: 0000000000000000 [ 53.982634][ C1] RDX: 0000000000000002 RSI: ffff8881180b4d3c RDI: ffff88810bc9cac2 [ 53.982634][ C1] RBP: ffff8881008f70b8 R08: ffff8881180b4cf4 R09: ffff8881180b4cf0 [ 53.982634][ C1] R10: ffffed1022999e5c R11: 0000000000000002 R12: 0000000000000590 [ 53.982634][ C1] R13: ffff88810f940c80 R14: ffff88810f940d50 R15: ffff88810bc9cac0 [ 53.982634][ C1] FS: 0000000000000000(0000) GS:ffff888235880000(0000) knlGS:0000000000000000 [ 53.982634][ C1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 53.982634][ C1] CR2: 00007ff5f9b86680 CR3: 0000000108ce8004 CR4: 0000000000170ee0 [ 53.982634][ C1] Call Trace: [ 53.982634][ C1] <TASK> [ 53.982634][ C1] tcp_sacktag_walk+0xaba/0x18e0 [ 53.982634][ C1] tcp_sacktag_write_queue+0xe7b/0x3460 [ 53.982634][ C1] tcp_ack+0x2666/0x54b0 [ 53.982634][ C1] tcp_rcv_established+0x4d9/0x20f0 [ 53.982634][ C1] tcp_v4_do_rcv+0x551/0x810 [ 53.982634][ C1] tcp_v4_rcv+0x22ed/0x2ed0 [ 53.982634][ C1] ip_protocol_deliver_rcu+0x96/0xaf0 [ 53.982634][ C1] ip_local_deliver_finish+0x1e0/0x2f0 [ 53.982634][ C1] ip_sublist_rcv_finish+0x211/0x440 [ 53.982634][ C1] ip_list_rcv_finish.constprop.0+0x424/0x660 [ 53.982634][ C1] ip_list_rcv+0x2c8/0x410 [ 53.982634][ C1] __netif_receive_skb_list_core+0x65c/0x910 [ 53.982634][ C1] netif_receive_skb_list_internal+0x5f9/0xcb0 [ 53.982634][ C1] napi_complete_done+0x188/0x6e0 [ 53.982634][ C1] gro_cell_poll+0x10c/0x1d0 [ 53.982634][ C1] __napi_poll+0xa1/0x530 [ 53.982634][ C1] net_rx_action+0x567/0x1270 [ 53.982634][ C1] __do_softirq+0x28a/0x9ba [ 53.982634][ C1] run_ksoftirqd+0x32/0x60 [ 53.982634][ C1] smpboot_thread_fn+0x559/0x8c0 [ 53.982634][ C1] kthread+0x3b9/0x490 [ 53.982634][ C1] ret_from_fork+0x22/0x30 [ 53.982634][ C1] </TASK> Address the issue by skipping the GRO stage for shared or cloned skbs. To reduce the chance of OoO, try to unclone the skbs before giving up. v1 -> v2: - use avoid skb_copy and fallback to netif_receive_skb - Eric Reported-by: Ignat Korchagin <ignat@cloudflare.com> Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/b5f61c5602aab01bac8d711d8d1bfab0a4817db7.1640197544.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23Merge tag 'wireless-drivers-next-2021-12-23' of ↵Jakub Kicinski163-2635/+5359
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for v5.17 Third set of patches for v5.17, and the final one if all goes well. We have Specific Absorption Rate (SAR) support for both mt76 and rtw88. Also iwlwifi should be now W=1 warning free. But otherwise nothing really special this time, business as usual. Major changes: mt76 * Specific Absorption Rate (SAR) support * mt7921: new PCI ids * mt7921: 160 MHz channel support iwlwifi * fix W=1 and sparse warnings * BNJ hardware support * add new killer device ids * support for Time-Aware-SAR (TAS) from the BIOS * Optimized Connectivity Experience (OCE) scan support rtw88 * hardware scan * Specific Absorption Rate (SAR) support ath11k * qca6390/wcn6855: report signal and tx bitrate * qca6390: rfkill support * qca6390/wcn6855: regdb.bin support ath5k * switch to rate table based lookup wilc1000 * spi: reset/enable GPIO support * tag 'wireless-drivers-next-2021-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next: (148 commits) mt76: mt7921: fix a possible race enabling/disabling runtime-pm wilc1000: Document enable-gpios and reset-gpios properties wilc1000: Add reset/enable GPIO support to SPI driver wilc1000: Convert static "chipid" variable to device-local variable rtw89: 8852a: correct bit definition of dfs_en rtw88: don't consider deep PS mode when transmitting packet ath11k: Fix unexpected return buffer manager error for QCA6390 ath11k: add support of firmware logging for WCN6855 ath11k: Fix napi related hang ath10k: replace strlcpy with strscpy rtw88: support SAR via kernel common API rtw88: 8822c: add ieee80211_ops::hw_scan iwlwifi: mei: wait before mapping the shared area iwlwifi: mei: clear the ownership when the driver goes down iwlwifi: yoyo: fix issue with new DBGI_SRAM region read. iwlwifi: fw: fix some scan kernel-doc iwlwifi: pcie: make sure prph_info is set when treating wakeup IRQ iwlwifi: mvm: remove card state notification code iwlwifi: mvm: drop too short packets silently iwlwifi: mvm: fix AUX ROC removal ... ==================== Link: https://lore.kernel.org/r/20211223141108.78808C36AE9@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23net: stmmac: add tc flower filter for EtherType matchingOng Boon Leong2-0/+124
This patch adds basic support for EtherType RX frame steering for LLDP and PTP using the hardware offload capabilities. Example steps for setting up RX frame steering for LLDP and PTP: $ IFDEVNAME=eth0 $ tc qdisc add dev $IFDEVNAME ingress $ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0 For LLDP $ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88cc \ flower hw_tc 5 OR $ tc filter add dev $IFDEVNAME parent ffff: protocol LLDP \ flower hw_tc 5 For PTP $ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88f7 \ flower hw_tc 6 Show tc ingress filter $ tc filter show dev $IFDEVNAME ingress v1->v2: Thanks to Kurt's and Sebastian's suggestion. - change from __be16 to u16 etype - change ETHER_TYPE_FULL_MASK to use cpu_to_be16() macro Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23net: lan966x: Add support for multiple bridge flagsHoratiu Vultur4-2/+82
This patch series extends the current supported bridge flags with the following flags: BR_FLOOD, BR_BCAST_FLOOD and BR_LEARNING. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23mlxsw: spectrum_flower: Make vlan_id limitation more specificAmit Cohen1-1/+2
Spectrum ASICs do not support matching of VLAN ID at egress. Currently, mlxsw driver forbids matching of all VLAN related fields at egress, which is too strict check. For example, the following filter is not supported by the driver: $ tc filter add dev swpX egress protocol 802.1q pref 1 handle 101 flower vlan_ethtype ipv4 src_ip .. dst_ip .. skip_sw action pass Error: mlxsw_spectrum: vlan_id key is not supported on egress. We have an error talking to the kernel The filter above does not match on VLAN ID, but is bounced anyway. Make the check more specific, forbid only matching of 'vlan_id' at egress. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23Merge tag 'mlx5-updates-2021-12-21' of ↵Jakub Kicinski18-79/+327
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-12-21 1) From Shay Drory: Devlink user knobs to control device's EQ size This series provides knobs which will enable users to minimize memory consumption of mlx5 Functions (PF/VF/SF). mlx5 exposes two new generic devlink params for EQ size configuration and uses devlink generic param max_macs. LINK: https://lore.kernel.org/netdev/20211208141722.13646-1-shayd@nvidia.com/ 2) From Tariq and Lama, allocate software channel objects and statistics of a mlx5 netdevice private data dynamically upon first demand to save on memory. * tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Take packet_merge params directly from the RX res struct net/mlx5e: Allocate per-channel stats dynamically at first usage net/mlx5e: Use dynamic per-channel allocations in stats net/mlx5e: Allow profile-specific limitation on max num of channels net/mlx5e: Save memory by using dynamic allocation in netdev priv net/mlx5e: Add profile indications for PTP and QOS HTB features net/mlx5e: Use bitmap field for profile features net/mlx5: Remove the repeated declaration net/mlx5: Let user configure max_macs generic param devlink: Clarifies max_macs generic devlink param net/mlx5: Let user configure event_eq_size param devlink: Add new "event_eq_size" generic device param net/mlx5: Let user configure io_eq_size param devlink: Add new "io_eq_size" generic device param ==================== Link: https://lore.kernel.org/r/20211222031604.14540-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23codel: remove unnecessary pkt_sched.h includeJakub Kicinski4-0/+6
Commit d068ca2ae2e6 ("codel: split into multiple files") moved all Qdisc-related code to codel_qdisc.h, move the include of pkt_sched.h as well. This is similar to the previous commit, although we don't care as much about incremental builds after pkt_sched.h was touched itself it is included by net/sch_generic.h which is modified ~20 times a year. This decreases the incremental build size after touching pkt_sched.h from 1592 to 617 objects. Fix unmasked missing includes in WiFi drivers. Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211221193941.3805147-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23codel: remove unnecessary sock.h includeJakub Kicinski3-0/+6
Since sock.h is modified relatively often (60 times in the last 12 months) it seems worthwhile to decrease the incremental build work. CoDel's header includes net/inet_ecn.h which in turn includes net/sock.h. codel.h is itself included by mac80211 which is included by much of the WiFi stack and drivers. Removing the net/inet_ecn.h include from CoDel breaks the dependecy between WiFi and sock.h. Commit d068ca2ae2e6 ("codel: split into multiple files") moved all the code which actually needs ECN helpers out to net/codel_impl.h, the include can be moved there as well. This decreases the incremental build size after touching sock.h from 4999 objects to 4051 objects. Fix unmasked missing includes in WiFi drivers. Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211221193941.3805147-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23net: broadcom: bcm4908enet: remove redundant variable bytesColin Ian King1-2/+0
The variable bytes is being used to summate slot lengths, however the value is never used afterwards. The summation is redundant so remove variable bytes. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20211222003937.727325-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23ice: trivial: fix odd indentingJesse Brandeburg1-3/+4
Fix an odd indent where some code was left indented, and causes smatch to warn: ice_log_pkg_init() warn: inconsistent indenting While here, for consistency, add a break after the default case. This commit has a Fixes: but we caught this while it was only in net-next. Fixes: 247dd97d713c ("ice: Refactor status flow for DDP load") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20211221230538.2546315-1-jesse.brandeburg@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23asix: fix wrong return value in asix_check_host_enable()Pavel Skripkin1-2/+4
If asix_read_cmd() returns 0 on 30th interation, 0 will be returned from asix_check_host_enable(), which is logically wrong. Fix it by returning -ETIMEDOUT explicitly if we have exceeded 30 iterations Also, replaced 30 with #define as suggested by Andrew Fixes: a786e3195d6a ("net: asix: fix uninit value bugs") Reported-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/ecd3470ce6c2d5697ac635d0d3b14a47defb4acb.1640117288.git.paskripkin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23asix: fix uninit-value in asix_mdio_read()Pavel Skripkin1-1/+1
asix_read_cmd() may read less than sizeof(smsr) bytes and in this case smsr will be uninitialized. Fail log: BUG: KMSAN: uninit-value in asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline] BUG: KMSAN: uninit-value in asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline] drivers/net/usb/asix_common.c:497 BUG: KMSAN: uninit-value in asix_mdio_read+0x3c1/0xb00 drivers/net/usb/asix_common.c:497 drivers/net/usb/asix_common.c:497 asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline] asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline] drivers/net/usb/asix_common.c:497 asix_mdio_read+0x3c1/0xb00 drivers/net/usb/asix_common.c:497 drivers/net/usb/asix_common.c:497 Fixes: d9fe64e51114 ("net: asix: Add in_pm parameter") Reported-and-tested-by: syzbot+f44badb06036334e867a@syzkaller.appspotmail.com Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/8966e3b514edf39857dd93603fc79ec02e000a75.1640117288.git.paskripkin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23Merge branch '100GbE' of ↵Jakub Kicinski11-233/+4357
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-12-21 This series contains updates to ice driver only. Karol modifies the reset flow to correct issues with PTP reset. Jake extends PTP support for E822 based devices. This includes a few cleanup patches, that fix some minor issues. In addition, there are some slight refactors to ease the addition of E822 support, followed by adding the new hardware implementation ice_ptp_hw.c. There are a few major differences with E822 support compared to E810 support: *) The E822 device has a Clock Generation Unit which must be initialized in order to generate proper clock frequencies on the output that drives the PTP hardware clock registers *) The E822 PHY is a bit different and requires a more complex initialization procedure which must be rerun any time the link configuration changes. *) The E822 devices support enhanced timestamp calibration by making use of a process called Vernier offset measurement. This allows the hardware to measure phase offset related to the PHY clocks for Serdes and FEC, reducing the inaccuracy of the timestamp relative to the actual packet transmission and receipt. Making use of this requires data gathered from the first transmitted and received packets, and waiting for the PHY to complete the calibration measurements. This is done as part of a new kthread, ov_work. Note that to avoid delay in enabling timestamps, we start the PHY in 'bypass' mode which allows timestamps to be captured without the Vernier calibration measurement. Once the first packets have been sent and received, we then complete the calibration setup and exit bypass mode and begin using the more precise timestamps. According to the datasheet, timestamps without calibration data can be incorrect relative to actual receipt or transmission by up to 1 clock cycle (~1.25 nanoseconds), while calibrated timestamps should be correct to within 1/8th of a clock cycle (~0.15 nanoseconds). *) E822 devices support crosstimestamping via PCIe PTM, which we enable when available on the platform. There is a fair amount of logic required to perform PHY and CGU initialization, which is the vast majority of the new code, but it is fairly self contained within ice_ptp_hw.c, with the exception of monitoring for offset validity being handled by a kthread. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: support crosstimestamping on E822 devices if supported ice: exit bypass mode once hardware finishes timestamp calibration ice: ensure the hardware Clock Generation Unit is configured ice: implement basic E822 PTP support ice: convert clk_freq capability into time_ref ice: introduce ice_ptp_init_phc function ice: use 'int err' instead of 'int status' in ice_ptp_hw.c ice: PTP: move setting of tstamp_config ice: introduce ice_base_incval function ice: Fix E810 PTP reset flow ==================== Link: https://lore.kernel.org/r/20211221174845.3063640-1-anthony.l.nguyen@intel.com Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>