summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2023-07-14net: dsa: ar9331: Use explict flags for regmap single read/writeMark Brown1-2/+2
The at9331 is only able to read or write a single register at once. The driver has a custom regmap bus and chooses to tell the regmap core about this by reporting the maximum transfer sizes rather than the explicit flags that exist at the regmap level. Since there are a number of problems with the raw transfer limits and the regmap level flags are better integrated anyway convert the driver to use the flags. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-14net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urbAlan Stern1-0/+6
The syzbot fuzzer identified a problem in the usbnet driver: usb 1-1: BOGUS urb xfer, pipe 3 != type 1 WARNING: CPU: 0 PID: 754 at drivers/usb/core/urb.c:504 usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504 Modules linked in: CPU: 0 PID: 754 Comm: kworker/0:2 Not tainted 6.4.0-rc7-syzkaller-00014-g692b7dc87ca6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 Workqueue: mld mld_ifc_work RIP: 0010:usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504 Code: 7c 24 18 e8 2c b4 5b fb 48 8b 7c 24 18 e8 42 07 f0 fe 41 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 a0 c9 fc 8a e8 5a 6f 23 fb <0f> 0b e9 58 f8 ff ff e8 fe b3 5b fb 48 81 c5 c0 05 00 00 e9 84 f7 RSP: 0018:ffffc9000463f568 EFLAGS: 00010086 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff88801eb28000 RSI: ffffffff814c03b7 RDI: 0000000000000001 RBP: ffff8881443b7190 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000003 R13: ffff88802a77cb18 R14: 0000000000000003 R15: ffff888018262500 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556a99c15a18 CR3: 0000000028c71000 CR4: 0000000000350ef0 Call Trace: <TASK> usbnet_start_xmit+0xfe5/0x2190 drivers/net/usb/usbnet.c:1453 __netdev_start_xmit include/linux/netdevice.h:4918 [inline] netdev_start_xmit include/linux/netdevice.h:4932 [inline] xmit_one net/core/dev.c:3578 [inline] dev_hard_start_xmit+0x187/0x700 net/core/dev.c:3594 ... This bug is caused by the fact that usbnet trusts the bulk endpoint addresses its probe routine receives in the driver_info structure, and it does not check to see that these endpoints actually exist and have the expected type and directions. The fix is simply to add such a check. Reported-and-tested-by: syzbot+63ee658b9a100ffadbe2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/000000000000a56e9105d0cec021@google.com/ Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Oliver Neukum <oneukum@suse.com> Link: https://lore.kernel.org/r/ea152b6d-44df-4f8a-95c6-4db51143dcc1@rowland.harvard.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-14dsa: mv88e6xxx: Do a final check before timing outLinus Walleij1-0/+7
I get sporadic timeouts from the driver when using the MV88E6352. Reading the status again after the loop fixes the problem: the operation is successful but goes undetected. Some added prints show things like this: [ 58.356209] mv88e6085 mdio_mux-0.1:00: Timeout while waiting for switch, addr 1b reg 0b, mask 8000, val 0000, data c000 [ 58.367487] mv88e6085 mdio_mux-0.1:00: Timeout waiting for ATU op 4000, fid 0001 (...) [ 61.826293] mv88e6085 mdio_mux-0.1:00: Timeout while waiting for switch, addr 1c reg 18, mask 8000, val 0000, data 9860 [ 61.837560] mv88e6085 mdio_mux-0.1:00: Timeout waiting for PHY command 1860 to complete The reason is probably not the commands: I think those are mostly fine with the 50+50ms timeout, but the problem appears when OpenWrt brings up several interfaces in parallel on a system with 7 populated ports: if one of them take more than 50 ms and waits one or more of the others can get stuck on the mutex for the switch and then this can easily multiply. As we sleep and wait, the function loop needs a final check after exiting the loop if we were successful. Suggested-by: Andrew Lunn <andrew@lunn.ch> Cc: Tobias Waldekranz <tobias@waldekranz.com> Fixes: 35da1dfd9484 ("net: dsa: mv88e6xxx: Improve performance of busy bit polling") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230712223405.861899-1-linus.walleij@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-14Merge tag 'net-6.5-rc2' of ↵Linus Torvalds57-295/+531
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, wireless and ebpf. Current release - regressions: - netfilter: conntrack: gre: don't set assured flag for clash entries - wifi: iwlwifi: remove 'use_tfh' config to fix crash Previous releases - regressions: - ipv6: fix a potential refcount underflow for idev - icmp6: ifix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev() - bpf: fix max stack depth check for async callbacks - eth: mlx5e: - check for NOT_READY flag state after locking - fix page_pool page fragment tracking for XDP - eth: igc: - fix tx hang issue when QBV gate is closed - fix corner cases for TSN offload - eth: octeontx2-af: Move validation of ptp pointer before its usage - eth: ena: fix shift-out-of-bounds in exponential backoff Previous releases - always broken: - core: prevent skb corruption on frag list segmentation - sched: - cls_fw: fix improper refcount update leads to use-after-free - sch_qfq: account for stab overhead in qfq_enqueue - netfilter: - report use refcount overflow - prevent OOB access in nft_byteorder_eval - wifi: mt7921e: fix init command fail with enabled device - eth: ocelot: fix oversize frame dropping for preemptible TCs - eth: fec: recycle pages for transmitted XDP frames" * tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) selftests: tc-testing: add test for qfq with stab overhead net/sched: sch_qfq: account for stab overhead in qfq_enqueue selftests: tc-testing: add tests for qfq mtu sanity check net/sched: sch_qfq: reintroduce lmax bound check for MTU wifi: cfg80211: fix receiving mesh packets without RFC1042 header wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set() net: txgbe: fix eeprom calculation error net/sched: make psched_mtu() RTNL-less safe net: ena: fix shift-out-of-bounds in exponential backoff netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write() net/sched: flower: Ensure both minimum and maximum ports are specified MAINTAINERS: Add another mailing list for QUALCOMM ETHQOS ETHERNET DRIVER docs: netdev: update the URL of the status page wifi: iwlwifi: remove 'use_tfh' config to fix crash xdp: use trusted arguments in XDP hints kfuncs bpf: cpumap: Fix memory leak in cpu_map_update_elem wifi: airo: avoid uninitialized warning in airo_get_rate() octeontx2-pf: Add additional check for MCAM rules net: dsa: Removed unneeded of_node_put in felix_parse_ports_node net: fec: use netdev_err_once() instead of netdev_err() ...
2023-07-13wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set()Zhang Shurong1-2/+3
If there is a failure during rtw89_fw_h2c_raw() rtw89_debug_priv_send_h2c should return negative error code instead of a positive value count. Fix this bug by returning correct error code. Fixes: e3ec7017f6a2 ("rtw89: add Realtek 802.11ax driver") Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://lore.kernel.org/r/tencent_AD09A61BC4DA92AD1EB0790F5C850E544D07@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-13net: txgbe: fix eeprom calculation errorJiawen Wu1-3/+0
For some device types like TXGBE_ID_XAUI, *checksum computed in txgbe_calc_eeprom_checksum() is larger than TXGBE_EEPROM_SUM. Remove the limit on the size of *checksum. Fixes: 049fe5365324 ("net: txgbe: Add operations to interact with firmware") Fixes: 5e2ea7801fac ("net: txgbe: Fix unsigned comparison to zero in txgbe_calc_eeprom_checksum()") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://lore.kernel.org/r/20230711063414.3311-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-13net: ena: fix shift-out-of-bounds in exponential backoffKrister Johansen1-0/+3
The ENA adapters on our instances occasionally reset. Once recently logged a UBSAN failure to console in the process: UBSAN: shift-out-of-bounds in build/linux/drivers/net/ethernet/amazon/ena/ena_com.c:540:13 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 28 PID: 70012 Comm: kworker/u72:2 Kdump: loaded not tainted 5.15.117 Hardware name: Amazon EC2 c5d.9xlarge/, BIOS 1.0 10/16/2017 Workqueue: ena ena_fw_reset_device [ena] Call Trace: <TASK> dump_stack_lvl+0x4a/0x63 dump_stack+0x10/0x16 ubsan_epilogue+0x9/0x36 __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e ? __const_udelay+0x43/0x50 ena_delay_exponential_backoff_us.cold+0x16/0x1e [ena] wait_for_reset_state+0x54/0xa0 [ena] ena_com_dev_reset+0xc8/0x110 [ena] ena_down+0x3fe/0x480 [ena] ena_destroy_device+0xeb/0xf0 [ena] ena_fw_reset_device+0x30/0x50 [ena] process_one_work+0x22b/0x3d0 worker_thread+0x4d/0x3f0 ? process_one_work+0x3d0/0x3d0 kthread+0x12a/0x150 ? set_kthread_struct+0x50/0x50 ret_from_fork+0x22/0x30 </TASK> Apparently, the reset delays are getting so large they can trigger a UBSAN panic. Looking at the code, the current timeout is capped at 5000us. Using a base value of 100us, the current code will overflow after (1<<29). Even at values before 32, this function wraps around, perhaps unintentionally. Cap the value of the exponent used for this backoff at (1<<16) which is larger than currently necessary, but large enough to support bigger values in the future. Cc: stable@vger.kernel.org Fixes: 4bb7f4cf60e3 ("net: ena: reduce driver load time") Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20230711013621.GE1926@templeofstupid.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-12netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write()Dan Carpenter1-6/+3
The simple_write_to_buffer() function is designed to handle partial writes. It returns negatives on error, otherwise it returns the number of bytes that were able to be copied. This code doesn't check the return properly. We only know that the first byte is written, the rest of the buffer might be uninitialized. There is no need to use the simple_write_to_buffer() function. Partial writes are prohibited by the "if (*ppos != 0)" check at the start of the function. Just use memdup_user() and copy the whole buffer. Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/7c1f950b-3a7d-4252-82a6-876e53078ef7@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-12Merge branch '1GbE' of ↵David S. Miller3-19/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== igc: Fix corner cases for TSN offload Florian Kauer says: The igc driver supports several different offloading capabilities relevant in the TSN context. Recent patches in this area introduced regressions for certain corner cases that are fixed in this series. Each of the patches (except the first one) addresses a different regression that can be separately reproduced. Still, they have overlapping code changes so they should not be separately applied. Especially #4 and #6 address the same observation, but both need to be applied to avoid TX hang occurrences in the scenario described in the patches. ==================== Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-12wifi: iwlwifi: remove 'use_tfh' config to fix crashJohannes Berg9-25/+18
This is equivalent to 'gen2', and it was always confusing to have two identical config entries. The split config patch actually had been originally developed after removing 'use_tfh" and didn't add the use_tfh in the new configs as they'd later been copied to the new files. Thus the easiest way to fix the init crash here now is to just remove use_tfh (which is erroneously unset in most of the configs now) and use 'gen2' in the code instead. There's possibly still an unwind error in iwl_txq_gen2_init() as it crashes if TXQ 0 fails to initialize, but we can deal with it later since the original failure is due to the use_tfh confusion. Tested-by: Xi Ruoyao <xry111@xry111.site> Reported-and-tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com> Reported-and-tested-by: Zhang Rui <rui.zhang@intel.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217622 Link: https://lore.kernel.org/all/9274d9bd3d080a457649ff5addcc1726f08ef5b2.camel@xry111.site/ Link: https://lore.kernel.org/all/CAAJw_Zug6VCS5ZqTWaFSr9sd85k%3DtyPm9DEE%2BmV%3DAKoECZM%2BsQ@mail.gmail.com/ Fixes: 19898ce9cf8a ("wifi: iwlwifi: split 22000.c into multiple files") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20230710145038.84186-2-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-12wifi: airo: avoid uninitialized warning in airo_get_rate()Randy Dunlap1-1/+4
Quieten a gcc (11.3.0) build error or warning by checking the function call status and returning -EBUSY if the function call failed. This is similar to what several other wireless drivers do for the SIOCGIWRATE ioctl call when there is a locking problem. drivers/net/wireless/cisco/airo.c: error: 'status_rid.currentXmitRate' is used uninitialized [-Werror=uninitialized] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/39abf2c7-24a-f167-91da-ed4c5435d1c4@linux-m68k.org Link: https://lore.kernel.org/r/20230709133154.26206-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-11octeontx2-pf: Add additional check for MCAM rulesSuman Ghosh2-0/+23
Due to hardware limitation, MCAM drop rule with ether_type == 802.1Q and vlan_id == 0 is not supported. Hence rejecting such rules. Fixes: dce677da57c0 ("octeontx2-pf: Add vlan-etype to ntuple filters") Signed-off-by: Suman Ghosh <sumang@marvell.com> Link: https://lore.kernel.org/r/20230710103027.2244139-1-sumang@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-11net: dsa: Removed unneeded of_node_put in felix_parse_ports_nodeLu Hongfei1-1/+0
Remove unnecessary of_node_put from the continue path to prevent child node from being released twice, which could avoid resource leak or other unexpected issues. Signed-off-by: Lu Hongfei <luhongfei@vivo.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Fixes: de879a016a94 ("net: dsa: felix: add functionality when not all ports are supported") Link: https://lore.kernel.org/r/20230710031859.36784-1-luhongfei@vivo.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-11net: fec: use netdev_err_once() instead of netdev_err()Wei Fang1-1/+1
In the case of heavy XDP traffic to be transmitted, the console will print the error log continuously if there are lack of enough BDs to accommodate the frames. The log looks like below. [ 160.013112] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.023116] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.028926] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.038946] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.044758] fec 30be0000.ethernet eth0: NOT enough BD for SG! Not only will this log be replicated and redundant, it will also degrade XDP performance. So we use netdev_err_once() instead of netdev_err() now. Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-11net: fec: increase the size of tx ring and update tx_wake_thresholdWei Fang2-3/+2
When the XDP feature is enabled and with heavy XDP frames to be transmitted, there is a considerable probability that available tx BDs are insufficient. This will lead to some XDP frames to be discarded and the "NOT enough BD for SG!" error log will appear in the console (as shown below). [ 160.013112] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.023116] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.028926] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.038946] fec 30be0000.ethernet eth0: NOT enough BD for SG! [ 160.044758] fec 30be0000.ethernet eth0: NOT enough BD for SG! In the case of heavy XDP traffic, sometimes the speed of recycling tx BDs may be slower than the speed of sending XDP frames. There may be several specific reasons, such as the interrupt is not responsed in time, the efficiency of the NAPI callback function is too low due to all the queues (tx queues and rx queues) share the same NAPI, and so on. After trying various methods, I think that increase the size of tx BD ring is simple and effective. Maybe the best resolution is that allocate NAPI for each queue to improve the efficiency of the NAPI callback, but this change is a bit big and I didn't try this method. Perheps this method will be implemented in a future patch. This patch also updates the tx_wake_threshold of tx ring which is related to the size of tx ring in the previous logic. Otherwise, the tx_wake_threshold will be too high (403 BDs), which is more likely to impact the slow path in the case of heavy XDP traffic, because XDP path and slow path share the tx BD rings. According to Jakub's suggestion, the tx_wake_threshold is at least equal to tx_stop_threshold + 2 * MAX_SKB_FRAGS, if a queue of hundreds of entries is overflowing, we should be able to apply a hysteresis of a few tens of entries. Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-11net: fec: recycle pages for transmitted XDP framesWei Fang2-48/+115
Once the XDP frames have been successfully transmitted through the ndo_xdp_xmit() interface, it's the driver responsibility to free the frames so that the page_pool can recycle the pages and reuse them. However, this action is not implemented in the fec driver. This leads to a user-visible problem that the console will print the following warning log. [ 157.568851] page_pool_release_retry() stalled pool shutdown 1389 inflight 60 sec [ 217.983446] page_pool_release_retry() stalled pool shutdown 1389 inflight 120 sec [ 278.399006] page_pool_release_retry() stalled pool shutdown 1389 inflight 181 sec [ 338.812885] page_pool_release_retry() stalled pool shutdown 1389 inflight 241 sec [ 399.226946] page_pool_release_retry() stalled pool shutdown 1389 inflight 302 sec Therefore, to solve this issue, we free XDP frames via xdp_return_frame() while cleaning the tx BD ring. Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-11net: fec: dynamically set the NETDEV_XDP_ACT_NDO_XMIT feature of XDPWei Fang1-4/+9
When a XDP program is installed or uninstalled, fec_restart() will be invoked to reset MAC and buffer descriptor rings. It's reasonable not to transmit any packet during the process of reset. However, the NETDEV_XDP_ACT_NDO_XMIT bit of xdp_features is enabled by default, that is to say, it's possible that the fec_enet_xdp_xmit() will be invoked even if the process of reset is not finished. In this case, the redirected XDP frames might be dropped and available transmit BDs may be incorrectly deemed insufficient. So this patch disable the NETDEV_XDP_ACT_NDO_XMIT feature by default and dynamically configure this feature when the bpf program is installed or uninstalled. Fixes: e4ac7cc6e5a4 ("net: fec: turn on XDP features") Signed-off-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-10igc: Fix inserting of empty frame for launchtimeFlorian Kauer1-1/+1
The insertion of an empty frame was introduced with commit db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") in order to ensure that the current cycle has at least one packet if there is some packet to be scheduled for the next cycle. However, the current implementation does not properly check if a packet is already scheduled for the current cycle. Currently, an empty packet is always inserted if and only if txtime >= end_of_cycle && txtime > last_tx_cycle but since last_tx_cycle is always either the end of the current cycle (end_of_cycle) or the end of a previous cycle, the second part (txtime > last_tx_cycle) is always true unless txtime == last_tx_cycle. What actually needs to be checked here is if the last_tx_cycle was already written within the current cycle, so an empty frame should only be inserted if and only if txtime >= end_of_cycle && end_of_cycle > last_tx_cycle. This patch does not only avoid an unnecessary insertion, but it can actually be harmful to insert an empty packet if packets are already scheduled in the current cycle, because it can lead to a situation where the empty packet is actually processed as the first packet in the upcoming cycle shifting the packet with the first_flag even one cycle into the future, finally leading to a TX hang. The TX hang can be reproduced on a i225 with: sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \ num_tc 1 \ map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 \ base-time 0 \ sched-entry S 01 300000 \ flags 0x1 \ txtime-delay 500000 \ clockid CLOCK_TAI sudo tc qdisc replace dev enp1s0 parent 100:1 etf \ clockid CLOCK_TAI \ delta 500000 \ offload \ skip_sock_check and traffic generator sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns with traffic.cfg #define ETH_P_IP 0x0800 { /* Ethernet Header */ 0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e, # MAC Dest - adapt as needed 0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36, # MAC Src - adapt as needed const16(ETH_P_IP), /* IPv4 Header */ 0b01000101, 0, # IPv4 version, IHL, TOS const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header)) const16(2), # IPv4 ident 0b01000000, 0, # IPv4 flags, fragmentation off 64, # IPv4 TTL 17, # Protocol UDP csumip(14, 33), # IPv4 checksum /* UDP Header */ 10, 0, 48, 1, # IP Src - adapt as needed 10, 0, 48, 10, # IP Dest - adapt as needed const16(5555), # UDP Src Port const16(6666), # UDP Dest Port const16(1008), # UDP length (UDP header 8 bytes + payload length) csumudp(14, 34), # UDP checksum /* Payload */ fill('W', 1000), } and the observed message with that is for example igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang Tx Queue <0> TDH <32> TDT <3c> next_to_use <3c> next_to_clean <32> buffer_info[next_to_clean] time_stamp <ffff26a8> next_to_watch <00000000632a1828> jiffies <ffff27f8> desc.status <1048000> Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-10igc: Fix launchtime before start of cycleFlorian Kauer1-1/+1
It is possible (verified on a running system) that frames are processed by igc_tx_launchtime with a txtime before the start of the cycle (baset_est). However, the result of txtime - baset_est is written into a u32, leading to a wrap around to a positive number. The following launchtime > 0 check will only branch to executing launchtime = 0 if launchtime is already 0. Fix it by using a s32 before checking launchtime > 0. Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-10igc: No strict mode in pure launchtime/CBS offloadFlorian Kauer1-2/+22
The flags IGC_TXQCTL_STRICT_CYCLE and IGC_TXQCTL_STRICT_END prevent the packet transmission over slot and cycle boundaries. This is important for taprio offload where the slots and cycles correspond to the slots and cycles configured for the network. However, the Qbv offload feature of the i225 is also used for enabling TX launchtime / ETF offload. In that case, however, the cycle has no meaning for the network and is only used internally to adapt the base time register after a second has passed. Enabling strict mode in this case would unnecessarily prevent the transmission of certain packets (i.e. at the boundary of a second) and thus interferes with the ETF qdisc that promises transmission at a certain point in time. Similar to ETF, this also applies to CBS offload that also should not be influenced by strict mode unless taprio offload would be enabled at the same time. This fully reverts commit d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling") but its commit message only describes what was already implemented before that commit. The difference to a plain revert of that commit is that it now copes with the base_time = 0 case that was fixed with commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv") In particular, enabling strict mode leads to TX hang situations under high traffic if taprio is applied WITHOUT taprio offload but WITH ETF offload, e.g. as in sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \ num_tc 1 \ map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 \ base-time 0 \ sched-entry S 01 300000 \ flags 0x1 \ txtime-delay 500000 \ clockid CLOCK_TAI sudo tc qdisc replace dev enp1s0 parent 100:1 etf \ clockid CLOCK_TAI \ delta 500000 \ offload \ skip_sock_check and traffic generator sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns with traffic.cfg #define ETH_P_IP 0x0800 { /* Ethernet Header */ 0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e, # MAC Dest - adapt as needed 0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36, # MAC Src - adapt as needed const16(ETH_P_IP), /* IPv4 Header */ 0b01000101, 0, # IPv4 version, IHL, TOS const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header)) const16(2), # IPv4 ident 0b01000000, 0, # IPv4 flags, fragmentation off 64, # IPv4 TTL 17, # Protocol UDP csumip(14, 33), # IPv4 checksum /* UDP Header */ 10, 0, 48, 1, # IP Src - adapt as needed 10, 0, 48, 10, # IP Dest - adapt as needed const16(5555), # UDP Src Port const16(6666), # UDP Dest Port const16(1008), # UDP length (UDP header 8 bytes + payload length) csumudp(14, 34), # UDP checksum /* Payload */ fill('W', 1000), } and the observed message with that is for example igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang Tx Queue <0> TDH <d0> TDT <f0> next_to_use <f0> next_to_clean <d0> buffer_info[next_to_clean] time_stamp <ffff661f> next_to_watch <00000000245a4efb> jiffies <ffff6e48> desc.status <1048000> Fixes: d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-10igc: Handle already enabled taprio offload for basetime 0Florian Kauer1-1/+1
Since commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv") it is possible to enable taprio offload with a basetime of 0. However, the check if taprio offload is already enabled (and thus -EALREADY should be returned for igc_save_qbv_schedule) still relied on adapter->base_time > 0. This can be reproduced as follows: # TAPRIO offload (flags == 0x2) and base-time = 0 sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \ num_tc 1 \ map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 \ base-time 0 \ sched-entry S 01 300000 \ flags 0x2 # The second call should fail with "Error: Device failed to setup taprio offload." # But that only happens if base-time was != 0 sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \ num_tc 1 \ map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 \ base-time 0 \ sched-entry S 01 300000 \ flags 0x2 Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-10igc: Do not enable taprio offload for invalid argumentsFlorian Kauer1-12/+6
Only set adapter->taprio_offload_enable after validating the arguments. Otherwise, it stays set even if the offload was not enabled. Since the subsequent code does not get executed in case of invalid arguments, it will not be read at first. However, by activating and then deactivating another offload (e.g. ETF/TX launchtime offload), taprio_offload_enable is read and erroneously keeps the offload feature of the NIC enabled. This can be reproduced as follows: # TAPRIO offload (flags == 0x2) and negative base-time leading to expected -ERANGE sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \ num_tc 1 \ map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 \ base-time -1000 \ sched-entry S 01 300000 \ flags 0x2 # IGC_TQAVCTRL is 0x0 as expected (iomem=relaxed for reading register) sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1 # Activate ETF offload sudo tc qdisc replace dev enp1s0 parent root handle 6666 mqprio \ num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@1 2@2 \ hw 0 sudo tc qdisc add dev enp1s0 parent 6666:1 etf \ clockid CLOCK_TAI \ delta 500000 \ offload # IGC_TQAVCTRL is 0x9 as expected sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1 # Deactivate ETF offload again sudo tc qdisc delete dev enp1s0 parent 6666:1 # IGC_TQAVCTRL should now be 0x0 again, but is observed as 0x9 sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1 Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-10igc: Rename qbv_enable to taprio_offload_enableFlorian Kauer3-5/+5
In the current implementation the flags adapter->qbv_enable and IGC_FLAG_TSN_QBV_ENABLED have a similar name, but do not have the same meaning. The first one is used only to indicate taprio offload (i.e. when igc_save_qbv_schedule was called), while the second one corresponds to the Qbv mode of the hardware. However, the second one is also used to support the TX launchtime feature, i.e. ETF qdisc offload. This leads to situations where adapter->qbv_enable is false, but the flag IGC_FLAG_TSN_QBV_ENABLED is set. This is prone to confusion. The rename should reduce this confusion. Since it is a pure rename, it has no impact on functionality. Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-10gve: unify driver name usageJunfeng Guo3-6/+8
Current codebase contained the usage of two different names for this driver (i.e., `gvnic` and `gve`), which is quite unfriendly for users to use, especially when trying to bind or unbind the driver manually. The corresponding kernel module is registered with the name of `gve`. It's more reasonable to align the name of the driver with the module. Fixes: 893ce44df565 ("gve: Add basic driver framework for Compute Engine Virtual NIC") Cc: csully@google.com Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-10net: dsa: qca8k: Add check for skb_copyJiasheng Jiang1-0/+3
Add check for the return value of skb_copy in order to avoid NULL pointer dereference. Fixes: 2cd548566384 ("net: dsa: qca8k: add support for phy read/write with mgmt Ethernet") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-09Merge tag 'ntb-6.5' of https://github.com/jonmason/ntbLinus Torvalds1-1/+1
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
2023-07-09net: lan743x: select FIXED_PHYSimon Horman1-1/+1
The blamed commit introduces usage of fixed_phy_register() but not a corresponding dependency on FIXED_PHY. This can result in a build failure. s390-linux-ld: drivers/net/ethernet/microchip/lan743x_main.o: in function `lan743x_phy_open': drivers/net/ethernet/microchip/lan743x_main.c:1514: undefined reference to `fixed_phy_register' Fixes: 624864fbff92 ("net: lan743x: add fixed phy support for LAN7431 device") Cc: stable@vger.kernel.org Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/netdev/725bf1c5-b252-7d19-7582-a6809716c7d6@infradead.org/ Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-08ntb_netdev: Fix module_init problemGeoff Levand1-1/+1
With both the ntb_transport_init and the ntb_netdev_init_module routines in the module_init init group, the ntb_netdev_init_module routine can be called before the ntb_transport_init routine that it depends on is called. To assure the proper initialization order put ntb_netdev_init_module in the late_initcall group. Fixes runtime errors where the ntb_netdev_init_module call fails with ENODEV. Signed-off-by: Geoff Levand <geoff@infradead.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2023-07-08net: bgmac: postpone turning IRQs off to avoid SoC hangsRafał Miłecki1-2/+2
Turning IRQs off is done by accessing Ethernet controller registers. That can't be done until device's clock is enabled. It results in a SoC hang otherwise. This bug remained unnoticed for years as most bootloaders keep all Ethernet interfaces turned on. It seems to only affect a niche SoC family BCM47189. It has two Ethernet controllers but CFE bootloader uses only the first one. Fixes: 34322615cbaa ("net: bgmac: Mask interrupts during probe") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-07ionic: remove dead device fail pathShannon Nelson1-6/+0
Remove the probe error path code that leaves the driver bound to the device, but with essentially a dead device. This was useful maybe twice early in the driver's life and no longer makes sense to keep. Fixes: 30a1e6d0f8e2 ("ionic: keep ionic dev on lif init fail") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-07ionic: remove WARN_ON to prevent panic_on_warnNitya Sunkad1-5/+0
Remove unnecessary early code development check and the WARN_ON that it uses. The irq alloc and free paths have long been cleaned up and this check shouldn't have stuck around so long. Fixes: 77ceb68e29cc ("ionic: Add notifyq support") Signed-off-by: Nitya Sunkad <nitya.sunkad@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-07octeontx2-af: Move validation of ptp pointer before its usageSai Krishna2-11/+10
Moved PTP pointer validation before its use to avoid smatch warning. Also used kzalloc/kfree instead of devm_kzalloc/devm_kfree. Fixes: 2ef4e45d99b1 ("octeontx2-af: Add PTP PPS Errata workaround on CN10K silicon") Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-07octeontx2-af: Promisc enable/disable through mboxRatheesh Kannoth2-11/+23
In legacy silicon, promiscuous mode is only modified through CGX mbox messages. In CN10KB silicon, it is modified from CGX mbox and NIX. This breaks legacy application behaviour. Fix this by removing call from NIX. Fixes: d6c9784baf59 ("octeontx2-af: Invoke exact match functions if supported") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-07Merge branch '1GbE' of ↵David S. Miller5-32/+118
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-07-05 (igc) This series contains updates to igc driver only. Husaini adds check to increment Qbv change error counter only on taprio Qbvs. He also removes delay during Tx ring configuration and resolves Tx hang that could occur when transmitting on a gate to be closed. Prasad Koya reports ethtool link mode as TP (twisted pair). Tee Min corrects value for max SDU. Aravindhan ensures that registers for PPS are always programmed to occur in future. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-07gve: Set default duplex configuration to fullJunfeng Guo1-0/+3
Current duplex mode was unset in the driver, resulting in the default parameter being set to 0, which corresponds to half duplex. It might mislead users to have incorrect expectation about the driver's transmission capabilities. Set the default duplex configuration to full, as the driver runs in full duplex mode at this point. Fixes: 7e074d5a76ca ("gve: Enable Link Speed Reporting in the driver.") Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Message-ID: <20230706044128.2726747-1-junfeng.guo@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-07Merge branch '100GbE' of ↵Jakub Kicinski3-19/+27
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-07-05 (ice) This series contains updates to ice driver only. Sridhar fixes incorrect comparison of max Tx rate limit to occur against each TC value rather than the aggregate. He also resolves an issue with the wrong VSI being used when setting max Tx rate when TCs are enabled. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix tx queue rate limit when TCs are configured ice: Fix max_rate check while configuring TX rate limits ==================== Link: https://lore.kernel.org/r/20230705201346.49370-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-07Merge tag 'mlx5-fixes-2023-07-05' of ↵Jakub Kicinski10-42/+61
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-07-05 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: RX, Fix page_pool page fragment tracking for XDP net/mlx5: Query hca_cap_2 only when supported net/mlx5e: TC, CT: Offload ct clear only once net/mlx5e: Check for NOT_READY flag state after locking net/mlx5: Register a unique thermal zone per device net/mlx5e: RX, Fix flush and close release flow of regular rq for legacy rq net/mlx5e: fix memory leak in mlx5e_ptp_open net/mlx5e: fix memory leak in mlx5e_fs_tt_redirect_any_create net/mlx5e: fix double free in mlx5e_destroy_flow_table ==================== Link: https://lore.kernel.org/r/20230705175757.284614-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-07wifi: mt76: mt7921e: fix init command fail with enabled deviceQuan Zhou3-12/+8
For some cases as below, we may encounter the unpreditable chip stats in driver probe() * The system reboot flow do not work properly, such as kernel oops while rebooting, and then the driver do not go back to default status at this moment. * Similar to the flow above. If the device was enabled in BIOS or UEFI, the system may switch to Linux without driver fully shutdown. To avoid the problem, force push the device back to default in probe() * mt7921e_mcu_fw_pmctrl() : return control privilege to chip side. * mt7921_wfsys_reset() : cleanup chip config before resource init. Error log [59007.600714] mt7921e 0000:02:00.0: ASIC revision: 79220010 [59010.889773] mt7921e 0000:02:00.0: Message 00000010 (seq 1) timeout [59010.889786] mt7921e 0000:02:00.0: Failed to get patch semaphore [59014.217839] mt7921e 0000:02:00.0: Message 00000010 (seq 2) timeout [59014.217852] mt7921e 0000:02:00.0: Failed to get patch semaphore [59017.545880] mt7921e 0000:02:00.0: Message 00000010 (seq 3) timeout [59017.545893] mt7921e 0000:02:00.0: Failed to get patch semaphore [59020.874086] mt7921e 0000:02:00.0: Message 00000010 (seq 4) timeout [59020.874099] mt7921e 0000:02:00.0: Failed to get patch semaphore [59024.202019] mt7921e 0000:02:00.0: Message 00000010 (seq 5) timeout [59024.202033] mt7921e 0000:02:00.0: Failed to get patch semaphore [59027.530082] mt7921e 0000:02:00.0: Message 00000010 (seq 6) timeout [59027.530096] mt7921e 0000:02:00.0: Failed to get patch semaphore [59030.857888] mt7921e 0000:02:00.0: Message 00000010 (seq 7) timeout [59030.857904] mt7921e 0000:02:00.0: Failed to get patch semaphore [59034.185946] mt7921e 0000:02:00.0: Message 00000010 (seq 8) timeout [59034.185961] mt7921e 0000:02:00.0: Failed to get patch semaphore [59037.514249] mt7921e 0000:02:00.0: Message 00000010 (seq 9) timeout [59037.514262] mt7921e 0000:02:00.0: Failed to get patch semaphore [59040.842362] mt7921e 0000:02:00.0: Message 00000010 (seq 10) timeout [59040.842375] mt7921e 0000:02:00.0: Failed to get patch semaphore [59040.923845] mt7921e 0000:02:00.0: hardware init failed Cc: stable@vger.kernel.org Fixes: 5c14a5f944b9 ("mt76: mt7921: introduce mt7921e support") Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Juan Martinez <juan.martinez@amd.com> Co-developed-by: Leon Yen <leon.yen@mediatek.com> Signed-off-by: Leon Yen <leon.yen@mediatek.com> Signed-off-by: Quan Zhou <quan.zhou@mediatek.com> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Message-ID: <39fcb7cee08d4ab940d38d82f21897483212483f.1688569385.git.deren.wu@mediatek.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-07net: mscc: ocelot: fix oversize frame dropping for preemptible TCsVladimir Oltean2-6/+22
This switch implements Hold/Release in a strange way, with no control from the user as required by IEEE 802.1Q-2018 through Set-And-Hold-MAC and Set-And-Release-MAC, but rather, it emits HOLD requests implicitly based on the schedule. Namely, when the gate of a preemptible TC is about to close (actually QSYS::PREEMPTION_CFG.HOLD_ADVANCE octet times in advance of this event), the QSYS seems to emit a HOLD request pulse towards the MAC which preempts the currently transmitted packet, and further packets are held back in the queue system. This allows large frames to be squeezed through small time slots, because HOLD requests initiated by the gate events result in the frame being segmented in multiple fragments, the bit time of which is equal to the size of the time slot. It has been reported that the vsc9959_tas_guard_bands_update() logic breaks this, because it doesn't take preemptible TCs into account, and enables oversized frame dropping when the time slot doesn't allow a full MTU to be sent, but it does allow 2*minFragSize to be sent (128B). Packets larger than 128B are dropped instead of being sent in multiple fragments. Confusingly, the manual says: | For guard band, SDU calculation of a traffic class of a port, if | preemption is enabled (through 'QSYS::PREEMPTION_CFG.P_QUEUES') then | QSYS::PREEMPTION_CFG.HOLD_ADVANCE is used, otherwise | QSYS::QMAXSDU_CFG_*.QMAXSDU_* is used. but this only refers to the static guard band durations, and the QMAXSDU_CFG_* registers have dual purpose - the other being oversized frame dropping, which takes place irrespective of whether frames are preemptible or express. So, to fix the problem, we need to call vsc9959_tas_guard_bands_update() from ocelot_port_update_active_preemptible_tcs(), and modify the guard band logic to consider a different (lower) oversize limit for preemptible traffic classes. Fixes: 403ffc2c34de ("net: mscc: ocelot: add support for preemptible traffic classes") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Message-ID: <20230705104422.49025-4-vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-07net: dsa: felix: make vsc9959_tas_guard_bands_update() visible to ocelot->opsVladimir Oltean3-5/+3
In a future change we will need to make ocelot_port_update_active_preemptible_tcs() call vsc9959_tas_guard_bands_update(), but that is currently not possible, since the ocelot switch lib does not have access to functions private to the DSA wrapper. Move the pointer to vsc9959_tas_guard_bands_update() from felix->info (which is private to the DSA driver) to ocelot->ops (which is also visible to the ocelot switch lib). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Message-ID: <20230705104422.49025-3-vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-07net: mscc: ocelot: extend ocelot->fwd_domain_lock to cover ocelot->tas_lockVladimir Oltean4-22/+26
In a future commit we will have to call vsc9959_tas_guard_bands_update() from ocelot_port_update_active_preemptible_tcs(), and that will be impossible due to the AB/BA locking dependencies between ocelot->tas_lock and ocelot->fwd_domain_lock. Just like we did in commit 3ff468ef987e ("net: mscc: ocelot: remove struct ocelot_mm_state :: lock"), the only solution is to expand the scope of ocelot->fwd_domain_lock for it to also serialize changes made to the Time-Aware Shaper, because those will have to result in a recalculation of cut-through TCs, which is something that depends on the forwarding domain. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Message-ID: <20230705104422.49025-2-vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-06Merge tag 'asm-generic-6.5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "These are cleanups for architecture specific header files: - the comments in include/linux/syscalls.h have gone out of sync and are really pointless, so these get removed - The asm/bitsperlong.h header no longer needs to be architecture specific on modern compilers, so use a generic version for newer architectures that use new enough userspace compilers - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking, forcing the use of pointers" * tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: syscalls: Remove file path comments from headers tools arch: Remove uapi bitsperlong.h of hexagon and microblaze asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch m68k/mm: Make pfn accessors static inlines arm64: memory: Make virt_to_pfn() a static inline ARM: mm: Make virt_to_pfn() a static inline asm-generic/page.h: Make pfn accessors static inlines xen/netback: Pass (void *) to virt_to_page() netfs: Pass a pointer to virt_to_page() cifs: Pass a pointer to virt_to_page() in cifsglob cifs: Pass a pointer to virt_to_page() riscv: mm: init: Pass a pointer to virt_to_page() ARC: init: Pass a pointer to virt_to_pfn() in init m68k: Pass a pointer to virt_to_pfn() virt_to_page() fs/proc/kcore.c: Pass a pointer to virt_addr_valid()
2023-07-06net: mvneta: fix txq_map in case of txq_number==1Klaus Kudielka1-2/+2
If we boot with mvneta.txq_number=1, the txq_map is set incorrectly: MVNETA_CPU_TXQ_ACCESS(1) refers to TX queue 1, but only TX queue 0 is initialized. Fix this. Fixes: 50bf8cb6fc9c ("net: mvneta: Configure XPS support") Signed-off-by: Klaus Kudielka <klaus.kudielka@gmail.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://lore.kernel.org/r/20230705053712.3914-1-klaus.kudielka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-06Merge tag 'net-6.5-rc1' of ↵Linus Torvalds30-138/+256
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, bpf and wireguard. Current release - regressions: - nvme-tcp: fix comma-related oops after sendpage changes Current release - new code bugs: - ptp: make max_phase_adjustment sysfs device attribute invisible when not supported Previous releases - regressions: - sctp: fix potential deadlock on &net->sctp.addr_wq_lock - mptcp: - ensure subflow is unhashed before cleaning the backlog - do not rely on implicit state check in mptcp_listen() Previous releases - always broken: - net: fix net_dev_start_xmit trace event vs skb_transport_offset() - Bluetooth: - fix use-bdaddr-property quirk - L2CAP: fix multiple UaFs - ISO: use hci_sync for setting CIG parameters - hci_event: fix Set CIG Parameters error status handling - hci_event: fix parsing of CIS Established Event - MGMT: fix marking SCAN_RSP as not connectable - wireguard: queuing: use saner cpu selection wrapping - sched: act_ipt: various bug fixes for iptables <> TC interactions - sched: act_pedit: add size check for TCA_PEDIT_PARMS_EX - dsa: fixes for receiving PTP packets with 8021q and sja1105 tagging - eth: sfc: fix null-deref in devlink port without MAE access - eth: ibmvnic: do not reset dql stats on NON_FATAL err Misc: - xsk: honor SO_BINDTODEVICE on bind" * tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) nfp: clean mc addresses in application firmware when closing port selftests: mptcp: pm_nl_ctl: fix 32-bit support selftests: mptcp: depend on SYN_COOKIES selftests: mptcp: userspace_pm: report errors with 'remove' tests selftests: mptcp: userspace_pm: use correct server port selftests: mptcp: sockopt: return error if wrong mark selftests: mptcp: sockopt: use 'iptables-legacy' if available selftests: mptcp: connect: fail if nft supposed to work mptcp: do not rely on implicit state check in mptcp_listen() mptcp: ensure subflow is unhashed before cleaning the backlog s390/qeth: Fix vipa deletion octeontx-af: fix hardware timestamp configuration net: dsa: sja1105: always enable the send_meta options net: dsa: tag_sja1105: fix MAC DA patching from meta frames net: Replace strlcpy with strscpy pptp: Fix fib lookup calls. mlxsw: spectrum_router: Fix an IS_ERR() vs NULL check net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX xsk: Honor SO_BINDTODEVICE on bind ptp: Make max_phase_adjustment sysfs device attribute invisible when not supported ...
2023-07-05igc: Handle PPS start time programming for past time valuesAravindhan Gunasekaran1-3/+22
I225/6 hardware can be programmed to start PPS output once the time in Target Time registers is reached. The time programmed in these registers should always be into future. Only then PPS output is triggered when SYSTIM register reaches the programmed value. There are two modes in i225/6 hardware to program PPS, pulse and clock mode. There were issues reported where PPS is not generated when start time is in past. Example 1, "echo 0 0 0 2 0 > /sys/class/ptp/ptp0/period" In the current implementation, a value of '0' is programmed into Target time registers and PPS output is in pulse mode. Eventually an interrupt which is triggered upon SYSTIM register reaching Target time is not fired. Thus no PPS output is generated. Example 2, "echo 0 0 0 1 0 > /sys/class/ptp/ptp0/period" Above case, a value of '0' is programmed into Target time registers and PPS output is in clock mode. Here, HW tries to catch-up the current time by incrementing Target Time register. This catch-up time seem to vary according to programmed PPS period time as per the HW design. In my experiments, the delay ranged between few tens of seconds to few minutes. The PPS output is only generated after the Target time register reaches current time. In my experiments, I also observed PPS stopped working with below test and could not recover until module is removed and loaded again. 1) echo 0 <future time> 0 1 0 > /sys/class/ptp/ptp1/period 2) echo 0 0 0 1 0 > /sys/class/ptp/ptp1/period 3) echo 0 0 0 1 0 > /sys/class/ptp/ptp1/period After this PPS did not work even if i re-program with proper values. I could only get this back working by reloading the driver. This patch takes care of calculating and programming appropriate future time value into Target Time registers. Fixes: 5e91c72e560c ("igc: Fix PPS delta between two synchronized end-points") Signed-off-by: Aravindhan Gunasekaran <aravindhan.gunasekaran@intel.com> Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-05igc: Include the length/type field and VLAN tag in queueMaxSDUTan Tee Min1-11/+4
IEEE 802.1Q does not have clear definitions of what constitutes an SDU (Service Data Unit), but IEEE Std 802.3 clause 3.1.2 does define the MAC service primitives and clause 3.2.7 does define the MAC Client Data for Q-tagged frames. It shows that the mac_service_data_unit (MSDU) does NOT contain the preamble, destination and source address, or FCS. The MSDU does contain the length/type field, MAC client data, VLAN tag and any padding data (prior to the FCS). Thus, the maximum 802.3 frame size that is allowed to be transmitted should be QueueMaxSDU (MSDU) + 16 (6 byte SA + 6 byte DA + 4 byte FCS). Fixes: 92a0dcb8427d ("igc: offload queue max SDU from tc-taprio") Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-05igc: set TP bit in 'supported' and 'advertising' fields of ↵Prasad Koya1-0/+2
ethtool_link_ksettings set TP bit in the 'supported' and 'advertising' fields. i225/226 parts only support twisted pair copper. Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Prasad Koya <prasad@arista.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-05nfp: clean mc addresses in application firmware when closing portYinjun Zhang1-0/+5
When moving devices from one namespace to another, mc addresses are cleaned in software while not removed from application firmware. Thus the mc addresses are remained and will cause resource leak. Now use `__dev_mc_unsync` to clean mc addresses when closing port. Fixes: e20aa071cd95 ("nfp: fix schedule in atomic context when sync mc address") Cc: stable@vger.kernel.org Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Message-ID: <20230705052818.7122-1-louis.peens@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-05net/mlx5e: RX, Fix page_pool page fragment tracking for XDPDragos Tatulea2-22/+13
Currently mlx5e releases pages directly to the page_pool for XDP_TX and does page fragment counting for XDP_REDIRECT. RX pages from the page_pool are leaking on XDP_REDIRECT because the xdp core will release only one fragment out of MLX5E_PAGECNT_BIAS_MAX and subsequently the page is marked as "skip release" which avoids the driver release. A fix would be to take an extra fragment for XDP_REDIRECT and not set the "skip release" bit so that the release on the driver side can handle the remaining bias fragments. But this would be a shortsighted solution. Instead, this patch converges the two XDP paths (XDP_TX and XDP_REDIRECT) to always do fragment tracking. The "skip release" bit is no longer necessary for XDP. Fixes: 6f5742846053 ("net/mlx5e: RX, Enable skb page recycling through the page_pool") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-07-05net/mlx5: Query hca_cap_2 only when supportedMaher Sanalla1-0/+3
On vport enable, where fw's hca caps are queried, the driver queries hca_caps_2 without checking if fw truly supports them, causing a false failure of vfs vport load and blocking SRIOV enablement on old devices such as CX4 where hca_caps_2 support is missing. Thus, add a check for the said caps support before accessing them. Fixes: e5b9642a33be ("net/mlx5: E-Switch, Implement devlink port function cmds to control migratable") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>