summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2022-02-12Merge tag 'usb-5.17-rc4' of ↵Linus Torvalds1-29/+39
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB driver fixes for 5.17-rc4 that resolve some reported issues and add new device ids: - usb-serial new device ids - ulpi cleanup fixes - f_fs use-after-free fix - dwc3 driver fixes - ax88179_178a usb network driver fix - usb gadget fixes There is a revert at the end of this series to resolve a build problem that 0-day found yesterday. Most of these have been in linux-next, except for the last few, and all have now passed 0-day tests" * tag 'usb-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: Revert "usb: dwc2: drd: fix soft connect when gadget is unconfigured" usb: dwc2: drd: fix soft connect when gadget is unconfigured usb: gadget: rndis: check size of RNDIS_MSG_SET command USB: gadget: validate interface OS descriptor requests usb: core: Unregister device on component_add() failure net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup usb: dwc3: gadget: Prevent core from processing stale TRBs USB: serial: cp210x: add CPI Bulk Coin Recycler id USB: serial: cp210x: add NCR Retail IO box id USB: serial: ftdi_sio: add support for Brainboxes US-159/235/320 usb: gadget: f_uac2: Define specific wTerminalType usb: gadget: udc: renesas_usb3: Fix host to USB_ROLE_NONE transition usb: raw-gadget: fix handling of dual-direction-capable endpoints usb: usb251xb: add boost-up property support usb: ulpi: Call of_node_put correctly usb: ulpi: Move of_node_put to ulpi_dev_release USB: serial: option: add ZTE MF286D modem USB: serial: ch341: add support for GW Instek USB2.0-Serial devices usb: f_fs: Fix use-after-free for epfile usb: dwc3: xilinx: fix uninitialized return value
2022-02-11net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixupJann Horn1-29/+39
ax88179_rx_fixup() contains several out-of-bounds accesses that can be triggered by a malicious (or defective) USB device, in particular: - The metadata array (hdr_off..hdr_off+2*pkt_cnt) can be out of bounds, causing OOB reads and (on big-endian systems) OOB endianness flips. - A packet can overlap the metadata array, causing a later OOB endianness flip to corrupt data used by a cloned SKB that has already been handed off into the network stack. - A packet SKB can be constructed whose tail is far beyond its end, causing out-of-bounds heap data to be considered part of the SKB's data. I have tested that this can be used by a malicious USB device to send a bogus ICMPv6 Echo Request and receive an ICMPv6 Echo Reply in response that contains random kernel heap data. It's probably also possible to get OOB writes from this on a little-endian system somehow - maybe by triggering skb_cow() via IP options processing -, but I haven't tested that. Fixes: e2ca90c276e1 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver") Cc: stable@kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-10net: dsa: mv88e6xxx: fix use-after-free in mv88e6xxx_mdios_unregisterVladimir Oltean1-2/+2
Since struct mv88e6xxx_mdio_bus *mdio_bus is the bus->priv of something allocated with mdiobus_alloc_size(), this means that mdiobus_free(bus) will free the memory backing the mdio_bus as well. Therefore, the mdio_bus->list element is freed memory, but we continue to iterate through the list of MDIO buses using that list element. To fix this, use the proper list iterator that handles element deletion by keeping a copy of the list element next pointer. Fixes: f53a2ce893b2 ("net: dsa: mv88e6xxx: don't use devres for mdiobus") Reported-by: Rafael Richter <rafael.richter@gin.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220210174017.3271099-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10Merge branch '100GbE' of ↵Jakub Kicinski5-16/+53
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-02-10 Dan Carpenter propagates an error in FEC configuration. Jesse fixes TSO offloads of IPIP and SIT frames. Dave adds a dedicated LAG unregister function to resolve a KASAN error and moves auxiliary device re-creation after LAG removal to the service task to avoid issues with RTNL lock. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Avoid RTNL lock when re-creating auxiliary device ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler ice: fix IPIP and SIT TSO offload ice: fix an error code in ice_cfg_phy_fec() ==================== Link: https://lore.kernel.org/r/20220210170515.2609656-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10net: mscc: ocelot: fix mutex lock error during ethtool stats readColin Foster1-4/+7
An ongoing workqueue populates the stats buffer. At the same time, a user might query the statistics. While writing to the buffer is mutex-locked, reading from the buffer wasn't. This could lead to buggy reads by ethtool. This patch fixes the former blamed commit, but the bug was introduced in the latter. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Fixes: 1e1caa9735f90 ("ocelot: Clean up stats update deferred work") Fixes: a556c76adc052 ("net: mscc: Add initial Ocelot switch support") Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/all/20220210150451.416845-2-colin.foster@in-advantage.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10ice: Avoid RTNL lock when re-creating auxiliary deviceDave Ertman2-1/+5
If a call to re-create the auxiliary device happens in a context that has already taken the RTNL lock, then the call flow that recreates auxiliary device can hang if there is another attempt to claim the RTNL lock by the auxiliary driver. To avoid this, any call to re-create auxiliary devices that comes from an source that is holding the RTNL lock (e.g. netdev notifier when interface exits a bond) should execute in a separate thread. To accomplish this, add a flag to the PF that will be evaluated in the service task and dealt with there. Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-10ice: Fix KASAN error in LAG NETDEV_UNREGISTER handlerDave Ertman1-6/+28
Currently, the same handler is called for both a NETDEV_BONDING_INFO LAG unlink notification as for a NETDEV_UNREGISTER call. This is causing a problem though, since the netdev_notifier_info passed has a different structure depending on which event is passed. The problem manifests as a call trace from a BUG: KASAN stack-out-of-bounds error. Fix this by creating a handler specific to NETDEV_UNREGISTER that only is passed valid elements in the netdev_notifier_info struct for the NETDEV_UNREGISTER event. Also included is the removal of an unbalanced dev_put on the peer_netdev and related braces. Fixes: 6a8b357278f5 ("ice: Respond to a NETDEV_UNREGISTER event for LAG") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-10ice: fix IPIP and SIT TSO offloadJesse Brandeburg2-8/+18
The driver was avoiding offload for IPIP (at least) frames due to parsing the inner header offsets incorrectly when trying to check lengths. This length check works for VXLAN frames but fails on IPIP frames because skb_transport_offset points to the inner header in IPIP frames, which meant the subtraction of transport_header from inner_network_header returns a negative value (-20). With the code before this patch, everything continued to work, but GSO was being used to segment, causing throughputs of 1.5Gb/s per thread. After this patch, throughput is more like 10Gb/s per thread for IPIP traffic. Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-10ice: fix an error code in ice_cfg_phy_fec()Dan Carpenter1-1/+2
Propagate the error code from ice_get_link_default_override() instead of returning success. Fixes: ea78ce4dab05 ("ice: add link lenient and default override support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-10dpaa2-eth: unregister the netdev before disconnecting from the PHYRobert-Ionut Alexa1-2/+2
The netdev should be unregistered before we are disconnecting from the MAC/PHY so that the dev_close callback is called and the PHY and the phylink workqueues are actually stopped before we are disconnecting and destroying the phylink instance. Fixes: 719479230893 ("dpaa2-eth: add MAC/PHY support through phylink") Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-10net: macb: Align the dma and coherent dma masksMarc St-Amand1-1/+1
Single page and coherent memory blocks can use different DMA masks when the macb accesses physical memory directly. The kernel is clever enough to allocate pages that fit into the requested address width. When using the ARM SMMU, the DMA mask must be the same for single pages and big coherent memory blocks. Otherwise the translation tables turn into one big mess. [ 74.959909] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK [ 74.959989] arm-smmu fd800000.smmu: Unhandled context fault: fsr=0x402, iova=0x3165687460, fsynr=0x20001, cbfrsynra=0x877, cb=1 [ 75.173939] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK [ 75.173955] arm-smmu fd800000.smmu: Unhandled context fault: fsr=0x402, iova=0x3165687460, fsynr=0x20001, cbfrsynra=0x877, cb=1 Since using the same DMA mask does not hurt direct 1:1 physical memory mappings, this commit always aligns DMA and coherent masks. Signed-off-by: Marc St-Amand <mstamand@ciena.com> Signed-off-by: Harini Katakam <harini.katakam@xilinx.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-10net: usb: qmi_wwan: Add support for Dell DW5829eSlark Xiao1-0/+2
Dell DW5829e same as DW5821e except the CAT level. DW5821e supports CAT16 but DW5829e supports CAT9. Also, DW5829e includes normal and eSIM type. Please see below test evidence: T: Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 5 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=413c ProdID=81e6 Rev=03.18 S: Manufacturer=Dell Inc. S: Product=DW5829e Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option T: Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 7 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=413c ProdID=81e4 Rev=03.18 S: Manufacturer=Dell Inc. S: Product=DW5829e-eSIM Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option Signed-off-by: Slark Xiao <slark_xiao@163.com> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20220209024717.8564-1-slark_xiao@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: amd-xgbe: disable interrupts during pci removalRaju Rangoju1-0/+3
Hardware interrupts are enabled during the pci probe, however, they are not disabled during pci removal. Disable all hardware interrupts during pci removal to avoid any issues. Fixes: e75377404726 ("amd-xgbe: Update PCI support to use new IRQ functions") Suggested-by: Selwin Sebastian <Selwin.Sebastian@amd.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09net: mdio: aspeed: Add missing MODULE_DEVICE_TABLEJoel Stanley1-0/+1
Fix loading of the driver when built as a module. Fixes: f160e99462c6 ("net: phy: Add mdio-aspeed") Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09veth: fix races around rq->rx_notify_maskedEric Dumazet1-5/+8
veth being NETIF_F_LLTX enabled, we need to be more careful whenever we read/write rq->rx_notify_masked. BUG: KCSAN: data-race in veth_xmit / veth_xmit write to 0xffff888133d9a9f8 of 1 bytes by task 23552 on cpu 0: __veth_xdp_flush drivers/net/veth.c:269 [inline] veth_xmit+0x307/0x470 drivers/net/veth.c:350 __netdev_start_xmit include/linux/netdevice.h:4683 [inline] netdev_start_xmit include/linux/netdevice.h:4697 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3473 dev_hard_start_xmit net/core/dev.c:3489 [inline] __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116 dev_queue_xmit+0x13/0x20 net/core/dev.c:4149 br_dev_queue_push_xmit+0x3ce/0x430 net/bridge/br_forward.c:53 NF_HOOK include/linux/netfilter.h:307 [inline] br_forward_finish net/bridge/br_forward.c:66 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] __br_forward+0x2e4/0x400 net/bridge/br_forward.c:115 br_flood+0x521/0x5c0 net/bridge/br_forward.c:242 br_dev_xmit+0x8b6/0x960 __netdev_start_xmit include/linux/netdevice.h:4683 [inline] netdev_start_xmit include/linux/netdevice.h:4697 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3473 dev_hard_start_xmit net/core/dev.c:3489 [inline] __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116 dev_queue_xmit+0x13/0x20 net/core/dev.c:4149 neigh_hh_output include/net/neighbour.h:525 [inline] neigh_output include/net/neighbour.h:539 [inline] ip_finish_output2+0x6f8/0xb70 net/ipv4/ip_output.c:228 ip_finish_output+0xfb/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:451 [inline] ip_local_out net/ipv4/ip_output.c:126 [inline] ip_send_skb+0x6e/0xe0 net/ipv4/ip_output.c:1570 udp_send_skb+0x641/0x880 net/ipv4/udp.c:967 udp_sendmsg+0x12ea/0x14c0 net/ipv4/udp.c:1254 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888133d9a9f8 of 1 bytes by task 23563 on cpu 1: __veth_xdp_flush drivers/net/veth.c:268 [inline] veth_xmit+0x2d6/0x470 drivers/net/veth.c:350 __netdev_start_xmit include/linux/netdevice.h:4683 [inline] netdev_start_xmit include/linux/netdevice.h:4697 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3473 dev_hard_start_xmit net/core/dev.c:3489 [inline] __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116 dev_queue_xmit+0x13/0x20 net/core/dev.c:4149 br_dev_queue_push_xmit+0x3ce/0x430 net/bridge/br_forward.c:53 NF_HOOK include/linux/netfilter.h:307 [inline] br_forward_finish net/bridge/br_forward.c:66 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] __br_forward+0x2e4/0x400 net/bridge/br_forward.c:115 br_flood+0x521/0x5c0 net/bridge/br_forward.c:242 br_dev_xmit+0x8b6/0x960 __netdev_start_xmit include/linux/netdevice.h:4683 [inline] netdev_start_xmit include/linux/netdevice.h:4697 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3473 dev_hard_start_xmit net/core/dev.c:3489 [inline] __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116 dev_queue_xmit+0x13/0x20 net/core/dev.c:4149 neigh_hh_output include/net/neighbour.h:525 [inline] neigh_output include/net/neighbour.h:539 [inline] ip_finish_output2+0x6f8/0xb70 net/ipv4/ip_output.c:228 ip_finish_output+0xfb/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:451 [inline] ip_local_out net/ipv4/ip_output.c:126 [inline] ip_send_skb+0x6e/0xe0 net/ipv4/ip_output.c:1570 udp_send_skb+0x641/0x880 net/ipv4/udp.c:967 udp_sendmsg+0x12ea/0x14c0 net/ipv4/udp.c:1254 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00 -> 0x01 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 23563 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00064-gc36c04c2e132 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 948d4f214fde ("veth: Add driver XDP") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09nfp: flower: fix ida_idx not being releasedLouis Peens1-4/+8
When looking for a global mac index the extra NFP_TUN_PRE_TUN_IDX_BIT that gets set if nfp_flower_is_supported_bridge is true is not taken into account. Consequently the path that should release the ida_index in cleanup is never triggered, causing messages like: nfp 0000:02:00.0: nfp: Failed to offload MAC on br-ex. nfp 0000:02:00.0: nfp: Failed to offload MAC on br-ex. nfp 0000:02:00.0: nfp: Failed to offload MAC on br-ex. after NFP_MAX_MAC_INDEX number of reconfigs. Ultimately this lead to new tunnel flows not being offloaded. Fix this by unsetting the NFP_TUN_PRE_TUN_IDX_BIT before checking if the port is of type OTHER. Fixes: 2e0bc7f3cb55 ("nfp: flower: encode mac indexes with pre-tunnel rule check") Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220208101453.321949-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: ethernet: litex: Add the dependency on HAS_IOMEMCai Huoqing1-1/+1
The LiteX driver uses devm io function API which needs HAS_IOMEM enabled, so add the dependency on HAS_IOMEM. Fixes: ee7da21ac4c3 ("net: Add driver for LiteX's LiteETH network interface") Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev> Link: https://lore.kernel.org/r/20220208013308.6563-1-cai.huoqing@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09ibmvnic: don't release napi in __ibmvnic_open()Sukadev Bhattiprolu1-4/+9
If __ibmvnic_open() encounters an error such as when setting link state, it calls release_resources() which frees the napi structures needlessly. Instead, have __ibmvnic_open() only clean up the work it did so far (i.e. disable napi and irqs) and leave the rest to the callers. If caller of __ibmvnic_open() is ibmvnic_open(), it should release the resources immediately. If the caller is do_reset() or do_hard_reset(), they will release the resources on the next reset. This fixes following crash that occurred when running the drmgr command several times to add/remove a vnic interface: [102056] ibmvnic 30000003 env3: Disabling rx_scrq[6] irq [102056] ibmvnic 30000003 env3: Disabling rx_scrq[7] irq [102056] ibmvnic 30000003 env3: Replenished 8 pools Kernel attempted to read user page (10) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000010 Faulting instruction address: 0xc000000000a3c840 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries ... CPU: 9 PID: 102056 Comm: kworker/9:2 Kdump: loaded Not tainted 5.16.0-rc5-autotest-g6441998e2e37 #1 Workqueue: events_long __ibmvnic_reset [ibmvnic] NIP: c000000000a3c840 LR: c0080000029b5378 CTR: c000000000a3c820 REGS: c0000000548e37e0 TRAP: 0300 Not tainted (5.16.0-rc5-autotest-g6441998e2e37) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28248484 XER: 00000004 CFAR: c0080000029bdd24 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0 GPR00: c0080000029b55d0 c0000000548e3a80 c0000000028f0200 0000000000000000 ... NIP [c000000000a3c840] napi_enable+0x20/0xc0 LR [c0080000029b5378] __ibmvnic_open+0xf0/0x430 [ibmvnic] Call Trace: [c0000000548e3a80] [0000000000000006] 0x6 (unreliable) [c0000000548e3ab0] [c0080000029b55d0] __ibmvnic_open+0x348/0x430 [ibmvnic] [c0000000548e3b40] [c0080000029bcc28] __ibmvnic_reset+0x500/0xdf0 [ibmvnic] [c0000000548e3c60] [c000000000176228] process_one_work+0x288/0x570 [c0000000548e3d00] [c000000000176588] worker_thread+0x78/0x660 [c0000000548e3da0] [c0000000001822f0] kthread+0x1c0/0x1d0 [c0000000548e3e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7d2948f8 792307e0 4e800020 60000000 3c4c01eb 384239e0 f821ffd1 39430010 38a0fff6 e92d1100 f9210028 39200000 <e9030010> f9010020 60420000 e9210020 ---[ end trace 5f8033b08fd27706 ]--- Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220208001918.900602-1-sukadev@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: dsa: lantiq_gswip: don't use devres for mdiobusVladimir Oltean1-3/+11
As explained in commits: 74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres") 5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres") mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered. The GSWIP switch is a platform device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here. If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the GSWIP switch driver on shutdown. So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all. The gswip driver has the code structure in place for orderly mdiobus removal, so just replace devm_mdiobus_alloc() with the non-devres variant, and add manual free where necessary, to ensure that we don't let devres free a still-registered bus. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: dsa: mt7530: fix kernel bug in mdiobus_free() when unbindingVladimir Oltean1-1/+1
Nobody in this driver calls mdiobus_unregister(), which is necessary if mdiobus_register() completes successfully. So if the devres callbacks that free the mdiobus get invoked (this is the case when unbinding the driver), mdiobus_free() will BUG if the mdiobus is still registered, which it is. My speculation is that this is due to the fact that prior to commit ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") from June 2020, _devm_mdiobus_free() used to call mdiobus_unregister(). But at the time that the mt7530 support was introduced in May 2021, the API was already changed. It's therefore likely that the blamed patch was developed on an older tree, and incorrectly adapted to net-next. This makes the Fixes: tag correct. Fix the problem by using the devres variant of mdiobus_register. Fixes: ba751e28d442 ("net: dsa: mt7530: add interrupt support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: dsa: seville: register the mdiobus under devresVladimir Oltean1-2/+3
As explained in commits: 74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres") 5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres") mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered. The Seville VSC9959 switch is a platform device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here. If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the seville switch driver on shutdown. So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all. The seville driver has a code structure that could accommodate both the mdiobus_unregister and mdiobus_free calls, but it has an external dependency upon mscc_miim_setup() from mdio-mscc-miim.c, which calls devm_mdiobus_alloc_size() on its behalf. So rather than restructuring that, and exporting yet one more symbol mscc_miim_teardown(), let's work with devres and replace of_mdiobus_register with the devres variant. When we use all-devres, we can ensure that devres doesn't free a still-registered bus (it either runs both callbacks, or none). Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: dsa: felix: don't use devres for mdiobusVladimir Oltean1-1/+3
As explained in commits: 74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres") 5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres") mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered. The Felix VSC9959 switch is a PCI device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here. If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the felix switch driver on shutdown. So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all. The felix driver has the code structure in place for orderly mdiobus removal, so just replace devm_mdiobus_alloc_size() with the non-devres variant, and add manual free where necessary, to ensure that we don't let devres free a still-registered bus. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: dsa: bcm_sf2: don't use devres for mdiobusVladimir Oltean1-2/+5
As explained in commits: 74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres") 5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres") mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered. The Starfighter 2 is a platform device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here. If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the bcm_sf2 switch driver on shutdown. So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all. The bcm_sf2 driver has the code structure in place for orderly mdiobus removal, so just replace devm_mdiobus_alloc() with the non-devres variant, and add manual free where necessary, to ensure that we don't let devres free a still-registered bus. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: dsa: ar9331: register the mdiobus under devresVladimir Oltean1-2/+1
As explained in commits: 74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres") 5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres") mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered. The ar9331 is an MDIO device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here. If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the ar9331 switch driver on shutdown. So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all. The ar9331 driver doesn't have a complex code structure for mdiobus removal, so just replace of_mdiobus_register with the devres variant in order to be all-devres and ensure that we don't free a still-registered bus. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09net: dsa: mv88e6xxx: don't use devres for mdiobusVladimir Oltean1-3/+8
As explained in commits: 74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres") 5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres") mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered. The mv88e6xxx is an MDIO device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here. If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the Marvell switch driver on shutdown. systemd-shutdown[1]: Powering off. mv88e6085 0x0000000008b96000:00 sw_gl0: Link is Down fsl-mc dpbp.9: Removing from iommu group 7 fsl-mc dpbp.8: Removing from iommu group 7 ------------[ cut here ]------------ kernel BUG at drivers/net/phy/mdio_bus.c:677! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00040-gdc05f73788e5 #15 pc : mdiobus_free+0x44/0x50 lr : devm_mdiobus_free+0x10/0x20 Call trace: mdiobus_free+0x44/0x50 devm_mdiobus_free+0x10/0x20 devres_release_all+0xa0/0x100 __device_release_driver+0x190/0x220 device_release_driver_internal+0xac/0xb0 device_links_unbind_consumers+0xd4/0x100 __device_release_driver+0x4c/0x220 device_release_driver_internal+0xac/0xb0 device_links_unbind_consumers+0xd4/0x100 __device_release_driver+0x94/0x220 device_release_driver+0x28/0x40 bus_remove_device+0x118/0x124 device_del+0x174/0x420 fsl_mc_device_remove+0x24/0x40 __fsl_mc_device_remove+0xc/0x20 device_for_each_child+0x58/0xa0 dprc_remove+0x90/0xb0 fsl_mc_driver_remove+0x20/0x5c __device_release_driver+0x21c/0x220 device_release_driver+0x28/0x40 bus_remove_device+0x118/0x124 device_del+0x174/0x420 fsl_mc_bus_remove+0x80/0x100 fsl_mc_bus_shutdown+0xc/0x1c platform_shutdown+0x20/0x30 device_shutdown+0x154/0x330 kernel_power_off+0x34/0x6c __do_sys_reboot+0x15c/0x250 __arm64_sys_reboot+0x20/0x30 invoke_syscall.constprop.0+0x4c/0xe0 do_el0_svc+0x4c/0x150 el0_svc+0x24/0xb0 el0t_64_sync_handler+0xa8/0xb0 el0t_64_sync+0x178/0x17c So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all. The Marvell driver already has a good structure for mdiobus removal, so just plug in mdiobus_free and get rid of devres. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Reported-by: Rafael Richter <Rafael.Richter@gin.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Daniel Klauer <daniel.klauer@gin.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09bonding: pair enable_port with slave_arr_updatesMahesh Bandewar1-1/+2
When 803.2ad mode enables a participating port, it should update the slave-array. I have observed that the member links are participating and are part of the active aggregator while the traffic is egressing via only one member link (in a case where two links are participating). Via kprobes I discovered that slave-arr has only one link added while the other participating link wasn't part of the slave-arr. I couldn't see what caused that situation but the simple code-walk through provided me hints that the enable_port wasn't always associated with the slave-array update. Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash") Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/20220207222901.1795287-1-maheshb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09gve: Recording rx queue before sending to napiTao Liu1-0/+1
This caused a significant performance degredation when using generic XDP with multiple queues. Fixes: f5cedc84a30d2 ("gve: Add transmit and receive support") Signed-off-by: Tao Liu <xliutaox@google.com> Link: https://lore.kernel.org/r/20220207175901.2486596-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-08net: phy: marvell: Fix RGMII Tx/Rx delays setting in 88e1121-compatible PHYsPavel Parkhomenko1-4/+6
It is mandatory for a software to issue a reset upon modifying RGMII Receive Timing Control and RGMII Transmit Timing Control bit fields of MAC Specific Control register 2 (page 2, register 21) otherwise the changes won't be perceived by the PHY (the same is applicable for a lot of other registers). Not setting the RGMII delays on the platforms that imply it' being done on the PHY side will consequently cause the traffic loss. We discovered that the denoted soft-reset is missing in the m88e1121_config_aneg() method for the case if the RGMII delays are modified but the MDIx polarity isn't changed or the auto-negotiation is left enabled, thus causing the traffic loss on our platform with Marvell Alaska 88E1510 installed. Let's fix that by issuing the soft-reset if the delays have been actually set in the m88e1121_config_aneg_rgmii_delays() method. Cc: stable@vger.kernel.org Fixes: d6ab93364734 ("net: phy: marvell: Avoid unnecessary soft reset") Signed-off-by: Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20220205203932.26899-1-Pavel.Parkhomenko@baikalelectronics.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-06net: phy: marvell: Fix MDI-x polarity setting in 88e1118-compatible PHYsPavel Parkhomenko1-4/+3
When setting up autonegotiation for 88E1118R and compatible PHYs, a software reset of PHY is issued before setting up polarity. This is incorrect as changes of MDI Crossover Mode bits are disruptive to the normal operation and must be followed by a software reset to take effect. Let's patch m88e1118_config_aneg() to fix the issue mentioned before by invoking software reset of the PHY just after setting up MDI-x polarity. Fixes: 605f196efbf8 ("phy: Add support for Marvell 88E1118 PHY") Signed-off-by: Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Cc: stable@vger.kernel.org Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-05net: mscc: ocelot: fix all IP traffic getting trapped to CPU with PTP over IPVladimir Oltean1-0/+8
The filters for the PTP trap keys are incorrectly configured, in the sense that is2_entry_set() only looks at trap->key.ipv4.dport or trap->key.ipv6.dport if trap->key.ipv4.proto or trap->key.ipv6.proto is set to IPPROTO_TCP or IPPROTO_UDP. But we don't do that, so is2_entry_set() goes through the "else" branch of the IP protocol check, and ends up installing a rule for "Any IP protocol match" (because msk is also 0). The UDP port is ignored. This means that when we run "ptp4l -i swp0 -4", all IP traffic is trapped to the CPU, which hinders bridging. Fix this by specifying the IP protocol in the VCAP IS2 filters for PTP over UDP. Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04ixgbevf: Require large buffers for build_skb on 82599VFSamuel Mendoza-Jonas1-6/+7
From 4.17 onwards the ixgbevf driver uses build_skb() to build an skb around new data in the page buffer shared with the ixgbe PF. This uses either a 2K or 3K buffer, and offsets the DMA mapping by NET_SKB_PAD + NET_IP_ALIGN. When using a smaller buffer RXDCTL is set to ensure the PF does not write a full 2K bytes into the buffer, which is actually 2K minus the offset. However on the 82599 virtual function, the RXDCTL mechanism is not available. The driver attempts to work around this by using the SET_LPE mailbox method to lower the maximm frame size, but the ixgbe PF driver ignores this in order to keep the PF and all VFs in sync[0]. This means the PF will write up to the full 2K set in SRRCTL, causing it to write NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the buffer. With 4K pages split into two buffers, this means it either writes NET_SKB_PAD + NET_IP_ALIGN bytes past the first buffer (and into the second), or NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the DMA mapping. Avoid this by only enabling build_skb when using "large" buffers (3K). These are placed in each half of an order-1 page, preventing the PF from writing past the end of the mapping. [0]: Technically it only ever raises the max frame size, see ixgbe_set_vf_lpe() in ixgbe_sriov.c Fixes: f15c5ba5b6cd ("ixgbevf: add support for using order 1 pages to receive large frames") Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04net: sparx5: Fix get_stat64 crash in tcpdumpSteen Hegelund1-1/+1
This problem was found with Sparx5 when the tcpdump tool requests the do_get_stats64 (sparx5_get_stats64) statistic. The portstats pointer was incorrectly incremented when fetching priority based statistics. Fixes: af4b11022e2d (net: sparx5: add ethtool configuration and statistics support) Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Link: https://lore.kernel.org/r/20220203102900.528987-1-steen.hegelund@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-04net: stmmac: ensure PTP time register reads are consistentYannick Vignon1-7/+12
Even if protected from preemption and interrupts, a small time window remains when the 2 register reads could return inconsistent values, each time the "seconds" register changes. This could lead to an about 1-second error in the reported time. Add logic to ensure the "seconds" and "nanoseconds" values are consistent. Fixes: 92ba6888510c ("stmmac: add the support for PTP hw clock driver") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20220203160025.750632-1-yannick.vignon@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03net: ipa: request IPA register values be retainedAlex Elder3-0/+64
In some cases, the IPA hardware needs to request the always-on subsystem (AOSS) to coordinate with the IPA microcontroller to retain IPA register values at power collapse. This is done by issuing a QMP request to the AOSS microcontroller. A similar request ondoes that request. We must get and hold the "QMP" handle early, because we might get back EPROBE_DEFER for that. But the actual request should be sent while we know the IPA clock is active, and when we know the microcontroller is operational. Fixes: 1aac309d3207 ("net: ipa: use autosuspend") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02net: sparx5: do not refer to skb after passing it onSteen Hegelund1-1/+1
Do not try to use any SKB fields after the packet has been passed up in the receive stack. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Link: https://lore.kernel.org/r/20220202083039.3774851-1-steen.hegelund@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02Merge tag 'mlx5-fixes-2022-02-01' of ↵David S. Miller16-53/+98
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-02-01 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. Sorry about the long series, but I had to move the top two patches from net-next to net to help avoiding a build break when kspp branch is merged into linus-next on next merge window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02Merge branch '1GbE' of ↵Jakub Kicinski3-19/+44
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-02-01 This series contains updates to e1000e driver only. Sasha removes CSME handshake with TGL platform as this is not supported and is causing hardware unit hangs to be reported. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Handshake with CSME starts from ADL platforms e1000e: Separate ADP board type from TGP ==================== Link: https://lore.kernel.org/r/20220201173754.580305-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02net/mlx5e: Avoid field-overflowing memcpy()Kees Cook2-4/+6
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Use flexible arrays instead of zero-element arrays (which look like they are always overflowing) and split the cross-field memcpy() into two halves that can be appropriately bounds-checked by the compiler. We were doing: #define ETH_HLEN 14 #define VLAN_HLEN 4 ... #define MLX5E_XDP_MIN_INLINE (ETH_HLEN + VLAN_HLEN) ... struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi); ... struct mlx5_wqe_eth_seg *eseg = &wqe->eth; struct mlx5_wqe_data_seg *dseg = wqe->data; ... memcpy(eseg->inline_hdr.start, xdptxd->data, MLX5E_XDP_MIN_INLINE); target is wqe->eth.inline_hdr.start (which the compiler sees as being 2 bytes in size), but copying 18, intending to write across start (really vlan_tci, 2 bytes). The remaining 16 bytes get written into wqe->data[0], covering byte_count (4 bytes), lkey (4 bytes), and addr (8 bytes). struct mlx5e_tx_wqe { struct mlx5_wqe_ctrl_seg ctrl; /* 0 16 */ struct mlx5_wqe_eth_seg eth; /* 16 16 */ struct mlx5_wqe_data_seg data[]; /* 32 0 */ /* size: 32, cachelines: 1, members: 3 */ /* last cacheline: 32 bytes */ }; struct mlx5_wqe_eth_seg { u8 swp_outer_l4_offset; /* 0 1 */ u8 swp_outer_l3_offset; /* 1 1 */ u8 swp_inner_l4_offset; /* 2 1 */ u8 swp_inner_l3_offset; /* 3 1 */ u8 cs_flags; /* 4 1 */ u8 swp_flags; /* 5 1 */ __be16 mss; /* 6 2 */ __be32 flow_table_metadata; /* 8 4 */ union { struct { __be16 sz; /* 12 2 */ u8 start[2]; /* 14 2 */ } inline_hdr; /* 12 4 */ struct { __be16 type; /* 12 2 */ __be16 vlan_tci; /* 14 2 */ } insert; /* 12 4 */ __be32 trailer; /* 12 4 */ }; /* 12 4 */ /* size: 16, cachelines: 1, members: 9 */ /* last cacheline: 16 bytes */ }; struct mlx5_wqe_data_seg { __be32 byte_count; /* 0 4 */ __be32 lkey; /* 4 4 */ __be64 addr; /* 8 8 */ /* size: 16, cachelines: 1, members: 3 */ /* last cacheline: 16 bytes */ }; So, split the memcpy() so the compiler can reason about the buffer sizes. "pahole" shows no size nor member offset changes to struct mlx5e_tx_wqe nor struct mlx5e_umr_wqe. "objdump -d" shows no meaningful object code changes (i.e. only source line number induced differences and optimizations). Fixes: b5503b994ed5 ("net/mlx5e: XDP TX forwarding support") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: Use struct_group() for memcpy() regionKees Cook1-1/+1
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Use struct_group() in struct vlan_ethhdr around members h_dest and h_source, so they can be referenced together. This will allow memcpy() and sizeof() to more easily reason about sizes, improve readability, and avoid future warnings about writing beyond the end of h_dest. "pahole" shows no size nor member offset changes to struct vlan_ethhdr. "objdump -d" shows no object code changes. Fixes: 34802a42b352 ("net/mlx5e: Do not modify the TX SKB") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: Avoid implicit modify hdr for decap drop ruleRoi Dayan1-1/+2
Currently the driver adds implicit modify hdr action for decap rules on tunnel devices if the port is an ovs port. This is also done if the action is drop and makes the modify hdr redundant and also the FW doesn't support it and will generate a syndrome. kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3) Fix it by adding the implicit modify hdr only for fwd actions. Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Fixes: 077cdda764c7 ("net/mlx5e: TC, Fix memory leak with rules with internal port") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP trafficRaed Salem1-2/+11
IPsec Tunnel mode crypto offload software parser (SWP) setting in data path currently always set the inner L4 offset regardless of the encapsulated L4 header type and whether it exists in the first place, this breaks non TCP/UDP traffic as such. Set the SWP inner L4 offset only when the IPsec tunnel encapsulated L4 header protocol is TCP/UDP. While at it fix inner ip protocol read for setting MLX5_ETH_WQE_SWP_INNER_L4_UDP flag to address the case where the ip header protocol is IPv6. Fixes: f1267798c980 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated trafficRaed Salem1-3/+6
IPsec crypto offload always set the ethernet segment checksum flags with the inner L4 header checksum flag enabled for encapsulated IPsec offloaded packet regardless of the encapsulated L4 header type, and even if it doesn't exists in the first place, this breaks non TCP/UDP traffic as such. Set the inner L4 checksum flag only when the encapsulated L4 header protocol is TCP/UDP using software parser swp_inner_l4_offset field as indication. Fixes: 5cfb540ef27b ("net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: Don't treat small ceil values as unlimited in HTB offloadMaxim Mikityanskiy1-1/+2
The hardware spec defines max_average_bw == 0 as "unlimited bandwidth". max_average_bw is calculated as `ceil / BYTES_IN_MBIT`, which can become 0 when ceil is small, leading to an undesired effect of having no bandwidth limit. This commit fixes it by rounding up small values of ceil to 1 Mbit/s. Fixes: 214baf22870c ("net/mlx5e: Support HTB offload") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5: E-Switch, Fix uninitialized variable modactMaor Dickman1-1/+1
The variable modact is not initialized before used in command modify header allocation which can cause command to fail. Fix by initializing modact with zeros. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: Fix handling of wrong devices during bond neteventMaor Dickman1-18/+14
Current implementation of bond netevent handler only check if the handled netdev is VF representor and it missing a check if the VF representor is on the same phys device of the bond handling the netevent. Fix by adding the missing check and optimizing the check if the netdev is VF representor so it will not access uninitialized private data and crashes. BUG: kernel NULL pointer dereference, address: 000000000000036c PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI Workqueue: eth3bond0 bond_mii_monitor [bonding] RIP: 0010:mlx5e_is_uplink_rep+0xc/0x50 [mlx5_core] RSP: 0018:ffff88812d69fd60 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff8881cf800000 RCX: 0000000000000000 RDX: ffff88812d69fe10 RSI: 000000000000001b RDI: ffff8881cf800880 RBP: ffff8881cf800000 R08: 00000445cabccf2b R09: 0000000000000008 R10: 0000000000000004 R11: 0000000000000008 R12: ffff88812d69fe10 R13: 00000000fffffffe R14: ffff88820c0f9000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88846fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000036c CR3: 0000000103d80006 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5e_eswitch_uplink_rep+0x31/0x40 [mlx5_core] mlx5e_rep_is_lag_netdev+0x94/0xc0 [mlx5_core] mlx5e_rep_esw_bond_netevent+0xeb/0x3d0 [mlx5_core] raw_notifier_call_chain+0x41/0x60 call_netdevice_notifiers_info+0x34/0x80 netdev_lower_state_changed+0x4e/0xa0 bond_mii_monitor+0x56b/0x640 [bonding] process_one_work+0x1b9/0x390 worker_thread+0x4d/0x3d0 ? rescuer_thread+0x350/0x350 kthread+0x124/0x150 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 Fixes: 7e51891a237f ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: Fix broken SKB allocation in HW-GROKhalid Manaa1-9/+17
In case the HW doesn't perform header-data split, it will write the whole packet into the data buffer in the WQ, in this case the SHAMPO CQE handler couldn't use the header entry to build the SKB, instead it should allocate a new memory to build the SKB using the function: mlx5e_skb_from_cqe_mpwrq_nonlinear. Fixes: f97d5c2a453e ("net/mlx5e: Add handle SHAMPO cqe support") Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: Fix wrong calculation of header index in HW_GROKhalid Manaa2-2/+7
The HW doesn't wrap the CQE.shampo.header_index field according to the headers buffer size, instead it always increases it until reaching overflow of u16 size. Thus the mlx5e_handle_rx_cqe_mpwrq_shampo handler should mask the CQE header_index field to find the actual header index in the headers buffer. Fixes: f97d5c2a453e ("net/mlx5e: Add handle SHAMPO cqe support") Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5: Bridge, Fix devlink deadlock on net namespace deletionRoi Dayan1-2/+2
When changing mode to switchdev, rep bridge init registered to netdevice notifier holds the devlink lock and then takes pernet_ops_rwsem. At that time deleting a netns holds pernet_ops_rwsem and then takes the devlink lock. Example sequence is: $ ip netns add foo $ devlink dev eswitch set pci/0000:00:08.0 mode switchdev & $ ip netns del foo deleting netns trace: [ 1185.365555] ? devlink_pernet_pre_exit+0x74/0x1c0 [ 1185.368331] ? mutex_lock_io_nested+0x13f0/0x13f0 [ 1185.370984] ? xt_find_table+0x40/0x100 [ 1185.373244] ? __mutex_lock+0x24a/0x15a0 [ 1185.375494] ? net_generic+0xa0/0x1c0 [ 1185.376844] ? wait_for_completion_io+0x280/0x280 [ 1185.377767] ? devlink_pernet_pre_exit+0x74/0x1c0 [ 1185.378686] devlink_pernet_pre_exit+0x74/0x1c0 [ 1185.379579] ? devlink_nl_cmd_get_dumpit+0x3a0/0x3a0 [ 1185.380557] ? xt_find_table+0xda/0x100 [ 1185.381367] cleanup_net+0x372/0x8e0 changing mode to switchdev trace: [ 1185.411267] down_write+0x13a/0x150 [ 1185.412029] ? down_write_killable+0x180/0x180 [ 1185.413005] register_netdevice_notifier+0x1e/0x210 [ 1185.414000] mlx5e_rep_bridge_init+0x181/0x360 [mlx5_core] [ 1185.415243] mlx5e_uplink_rep_enable+0x269/0x480 [mlx5_core] [ 1185.416464] ? mlx5e_uplink_rep_disable+0x210/0x210 [mlx5_core] [ 1185.417749] mlx5e_attach_netdev+0x232/0x400 [mlx5_core] [ 1185.418906] mlx5e_netdev_attach_profile+0x15b/0x1e0 [mlx5_core] [ 1185.420172] mlx5e_netdev_change_profile+0x15a/0x1d0 [mlx5_core] [ 1185.421459] mlx5e_vport_rep_load+0x557/0x780 [mlx5_core] [ 1185.422624] ? mlx5e_stats_grp_vport_rep_num_stats+0x10/0x10 [mlx5_core] [ 1185.424006] mlx5_esw_offloads_rep_load+0xdb/0x190 [mlx5_core] [ 1185.425277] esw_offloads_enable+0xd74/0x14a0 [mlx5_core] Fix this by registering rep bridges for per net netdev notifier instead of global one, which operats on the net namespace without holding the pernet_ops_rwsem. Fixes: 19e9bfa044f3 ("net/mlx5: Bridge, add offload infrastructure") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLEDima Chumak1-3/+4
Only prio 1 is supported for nic mode when there is no ignore flow level support in firmware. But for switchdev mode, which supports fixed number of statically pre-allocated prios, this restriction is not relevant so it can be relaxed. Fixes: d671e109bd85 ("net/mlx5: Fix tc max supported prio for nic mode") Signed-off-by: Dima Chumak <dchumak@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-02net/mlx5e: TC, Reject rules with forward and drop actionsRoi Dayan1-0/+6
Such rules are redundant but allowed and passed to the driver. The driver does not support offloading such rules so return an error. Fixes: 03a9d11e6eeb ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>