summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)AuthorFilesLines
2024-06-21RDMA/mana_ib: Ignore optional access flags for MRsKonstantin Taranov1-0/+1
Ignore optional ib_access_flags when an MR is created. Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1717575368-14879-1-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-06-21RDMA/mlx5: Add check for srq max_sge attributePatrisious Haddad1-5/+8
max_sge attribute is passed by the user, and is inserted and used unchecked, so verify that the value doesn't exceed maximum allowed value before using it. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/277ccc29e8d57bfd53ddeb2ac633f2760cf8cdd0.1716900410.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-06-21RDMA/mlx5: Fix unwind flow as part of mlx5_ib_stage_init_initYishai Hadas1-2/+2
Fix unwind flow as part of mlx5_ib_stage_init_init to use the correct goto upon an error. Fixes: 758ce14aee82 ("RDMA/mlx5: Implement MACsec gid addition and deletion") Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/aa40615116eda14ec9eca21d52017d632ea89188.1716900410.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-06-21RDMA/mlx5: Ensure created mkeys always have a populated rb_keyJason Gunthorpe1-1/+2
cachable and mmkey.rb_key together are used by mlx5_revoke_mr() to put the MR/mkey back into the cache. In all cases they should be set correctly. alloc_cacheable_mr() was setting cachable but not filling rb_key, resulting in cache_ent_find_and_store() bucketing them all into a 0 length entry. implicit_get_child_mr()/mlx5_ib_alloc_implicit_mr() failed to set cachable or rb_key at all, so the cache was not working at all for implicit ODP. Cc: stable@vger.kernel.org Fixes: 8c1185fef68c ("RDMA/mlx5: Change check for cacheable mkeys") Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7778c02dfa0999a30d6746c79a23dd7140a9c729.1716900410.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-06-21RDMA/mlx5: Follow rb_key.ats when creating new mkeysJason Gunthorpe1-0/+1
When a cache ent already exists but doesn't have any mkeys in it the cache will automatically create a new one based on the specification in the ent->rb_key. ent->ats was missed when creating the new key and so ma_translation_mode was not being set even though the ent requires it. Cc: stable@vger.kernel.org Fixes: 73d09b2fe833 ("RDMA/mlx5: Introduce mlx5r_cache_rb_key") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/7c5613458ecb89fbe5606b7aa4c8d990bdea5b9a.1716900410.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-06-21RDMA/mlx5: Remove extra unlock on error pathJason Gunthorpe1-3/+1
The below commit lifted the locking out of this function but left this error path unlock behind resulting in unbalanced locking. Remove the missed unlock too. Cc: stable@vger.kernel.org Fixes: 627122280c87 ("RDMA/mlx5: Add work to remove temporary entries from the cache") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/78090c210c750f47219b95248f9f782f34548bb1.1716900410.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-30RDMA/bnxt_re: Fix the max msix vectors macroSelvin Xavier1-3/+1
bnxt_re no longer decide the number of MSI-x vectors used by itself. Its decided by bnxt_en now. So when bnxt_en changes this value, system crash is seen. Depend on the max value reported by bnxt_en instead of using the its own macros. Fixes: 303432211324 ("bnxt_en: Remove runtime interrupt vector allocation") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1716195418-11767-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-23tracing/treewide: Remove second parameter of __assign_str()Steven Rostedt (Google)5-7/+7
With the rework of how the __string() handles dynamic strings where it saves off the source string in field in the helper structure[1], the assignment of that value to the trace event field is stored in the helper value and does not need to be passed in again. This means that with: __string(field, mystring) Which use to be assigned with __assign_str(field, mystring), no longer needs the second parameter and it is unused. With this, __assign_str() will now only get a single parameter. There's over 700 users of __assign_str() and because coccinelle does not handle the TRACE_EVENT() macro I ended up using the following sed script: git grep -l __assign_str | while read a ; do sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file; mv /tmp/test-file $a; done I then searched for __assign_str() that did not end with ';' as those were multi line assignments that the sed script above would fail to catch. Note, the same updates will need to be done for: __assign_str_len() __assign_rel_str() __assign_rel_str_len() I tested this with both an allmodconfig and an allyesconfig (build only for both). [1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/ Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts. Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal Acked-by: Takashi Iwai <tiwai@suse.de> Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs Tested-by: Guenter Roeck <linux@roeck-us.net>
2024-05-22Merge tag 'driver-core-6.10-rc1' of ↵Linus Torvalds3-15/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the small set of driver core and kernfs changes for 6.10-rc1. Nothing major here at all, just a small set of changes for some driver core apis, and minor fixups. Included in here are: - sysfs_bin_attr_simple_read() helper added and used - device_show_string() helper added and used All usages of these were acked by the various maintainers. Also in here are: - kernfs minor cleanup - removed unused functions - typo fix in documentation - pay attention to sysfs_create_link() failures in module.c finally All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: device property: Fix a typo in the description of device_get_child_node_count() kernfs: mount: Remove unnecessary ‘NULL’ values from knparent scsi: Use device_show_string() helper for sysfs attributes platform/x86: Use device_show_string() helper for sysfs attributes perf: Use device_show_string() helper for sysfs attributes IB/qib: Use device_show_string() helper for sysfs attributes hwmon: Use device_show_string() helper for sysfs attributes driver core: Add device_show_string() helper for sysfs attributes treewide: Use sysfs_bin_attr_simple_read() helper sysfs: Add sysfs_bin_attr_simple_read() helper module: don't ignore sysfs_create_link() failures driver core: Remove unused platform_notify, platform_notify_remove
2024-05-21Merge tag 'pci-v6.10-changes' of ↵Linus Torvalds4-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Skip E820 checks for MCFG ECAM regions for new (2016+) machines, since there's no requirement to describe them in E820 and some platforms require ECAM to work (Bjorn Helgaas) - Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien Le Moal) - Remove last user and pci_enable_device_io() (Heiner Kallweit) - Wait for Link Training==0 to avoid possible race (Ilpo Järvinen) - Skip waiting for devices that have been disconnected while suspended (Ilpo Järvinen) - Clear Secondary Status errors after enumeration since Master Aborts and Unsupported Request errors are an expected part of enumeration (Vidya Sagar) MSI: - Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas) Error handling: - Mask Genesys GL975x SD host controller Replay Timer Timeout correctable errors caused by a hardware defect; the errors cause interrupts that prevent system suspend (Kai-Heng Feng) - Fix EDR-related _DSM support, which previously evaluated revision 5 but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan) ASPM: - Simplify link state definitions and mask calculation (Ilpo Järvinen) Power management: - Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS apparently doesn't know how to put them back in D0 (Mario Limonciello) CXL: - Support resetting CXL devices; special handling required because CXL Ports mask Secondary Bus Reset by default (Dave Jiang) DOE: - Support DOE Discovery Version 2 (Alexey Kardashevskiy) Endpoint framework: - Set endpoint BAR to be 64-bit if the driver says that's all the device supports, in addition to doing so if the size is >2GB (Niklas Cassel) - Simplify endpoint BAR allocation and setting interfaces (Niklas Cassel) Cadence PCIe controller driver: - Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof Kozlowski) Cadence PCIe endpoint driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) Freescale Layerscape PCIe controller driver: - Convert DT binding to YAML (Frank Li) MediaTek MT7621 PCIe controller driver: - Add DT binding missing 'reg' property for child Root Ports (Krzysztof Kozlowski) - Fix theoretical string truncation in PHY name (Sergio Paracuellos) NVIDIA Tegra194 PCIe controller driver: - Return success for endpoint probe instead of falling through to the failure path (Vidya Sagar) Renesas R-Car PCIe controller driver: - Add DT binding missing IOMMU properties (Geert Uytterhoeven) - Add DT binding R-Car V4H compatible for host and endpoint mode (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) - Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski) - Set the Subsystem Vendor ID, which was previously zero because it was masked incorrectly (Rick Wertenbroek) Synopsys DesignWare PCIe controller driver: - Restructure DBI register access to accommodate devices where this requires Refclk to be active (Manivannan Sadhasivam) - Remove the deinit() callback, which was only need by the pcie-rcar-gen4, and do it directly in that driver (Manivannan Sadhasivam) - Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean up things like eDMA (Manivannan Sadhasivam) - Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel to dw_pcie_ep_init() (Manivannan Sadhasivam) - Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to reflect the actual functionality (Manivannan Sadhasivam) - Call dw_pcie_ep_init_registers() directly from all the glue drivers, not just those that require active Refclk from the host (Manivannan Sadhasivam) - Remove the "core_init_notifier" flag, which was an obscure way for glue drivers to indicate that they depend on Refclk from the host (Manivannan Sadhasivam) TI J721E PCIe driver: - Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli) - Add DT binding J722S SoC support (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Add DT binding missing num-viewport, phys and phy-name properties (Jan Kiszka) Miscellaneous: - Constify and annotate with __ro_after_init (Heiner Kallweit) - Convert DT bindings to YAML (Krzysztof Kozlowski) - Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming Zhou)" * tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits) PCI: Do not wait for disconnected devices when resuming x86/pci: Skip early E820 check for ECAM region PCI: Remove unused pci_enable_device_io() ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io() PCI: Update pci_find_capability() stub return types PCI: Remove PCI_IRQ_LEGACY scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios Revert "genirq/msi: Provide constants for PCI/IMS support" Revert "x86/apic/msi: Enable PCI/IMS" Revert "iommu/vt-d: Enable PCI/IMS" Revert "iommu/amd: Enable PCI/IMS" Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support" ...
2024-05-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds42-488/+1154
Pull rdma updates from Jason Gunthorpe: "Aside from the usual things this has an arch update for __iowrite64_copy() used by the RDMA drivers. This API was intended to generate large 64 byte MemWr TLPs on PCI. These days most processors had done this by just repeating writel() in a loop. S390 and some new ARM64 designs require a special helper to get this to generate. - Small improvements and fixes for erdma, efa, hfi1, bnxt_re - Fix a UAF crash after module unload on leaking restrack entry - Continue adding full RDMA support in mana with support for EQs, GID's and CQs - Improvements to the mkey cache in mlx5 - DSCP traffic class support in hns and several bug fixes - Cap the maximum number of MADs in the receive queue to avoid OOM - Another batch of rxe bug fixes from large scale testing - __iowrite64_copy() optimizations for write combining MMIO memory - Remove NULL checks before dev_put/hold() - EFA support for receive with immediate - Fix a recent memleaking regression in a cma error path" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (70 commits) RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siw RDMA/IPoIB: Fix format truncation compilation errors bnxt_re: avoid shift undefined behavior in bnxt_qplib_alloc_init_hwq RDMA/efa: Support QP with unsolicited write w/ imm. receive IB/hfi1: Remove generic .ndo_get_stats64 IB/hfi1: Do not use custom stat allocator RDMA/hfi1: Use RMW accessors for changing LNKCTL2 RDMA/mana_ib: implement uapi for creation of rnic cq RDMA/mana_ib: boundary check before installing cq callbacks RDMA/mana_ib: introduce a helper to remove cq callbacks RDMA/mana_ib: create and destroy RNIC cqs RDMA/mana_ib: create EQs for RNIC CQs RDMA/core: Remove NULL check before dev_{put, hold} RDMA/ipoib: Remove NULL check before dev_{put, hold} RDMA/mlx5: Remove NULL check before dev_{put, hold} RDMA/mlx5: Track DCT, DCI and REG_UMR QPs as diver_detail resources. RDMA/core: Add an option to display driver-specific QPs in the rdmatool RDMA/efa: Add shutdown notifier RDMA/mana_ib: Fix missing ret value IB/mlx5: Use __iowrite64_copy() for write combining stores ...
2024-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+1
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 35d92abfbad8 ("net: hns3: fix kernel crash when devlink reload during initialization") 2a1a1a7b5fd7 ("net: hns3: add command queue trace for hns3") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-09Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-0/+1
Pull dentry leak fix from Al Viro: "Dentry leak fix in the qibfs driver that I forgot to send a pull request for ;-/ My apologies - it actually sat in vfs.git#fixes for more than two months..." * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: qibfs: fix dentry leak
2024-05-09bnxt_re: avoid shift undefined behavior in bnxt_qplib_alloc_init_hwqMichal Schmidt1-1/+2
Undefined behavior is triggered when bnxt_qplib_alloc_init_hwq is called with hwq_attr->aux_depth != 0 and hwq_attr->aux_stride == 0. In that case, "roundup_pow_of_two(hwq_attr->aux_stride)" gets called. roundup_pow_of_two is documented as undefined for 0. Fix it in the one caller that had this combination. The undefined behavior was detected by UBSAN: UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13 shift exponent 64 is too large for 64-bit type 'long unsigned int' CPU: 24 PID: 1075 Comm: (udev-worker) Not tainted 6.9.0-rc6+ #4 Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.7 10/25/2023 Call Trace: <TASK> dump_stack_lvl+0x5d/0x80 ubsan_epilogue+0x5/0x30 __ubsan_handle_shift_out_of_bounds.cold+0x61/0xec __roundup_pow_of_two+0x25/0x35 [bnxt_re] bnxt_qplib_alloc_init_hwq+0xa1/0x470 [bnxt_re] bnxt_qplib_create_qp+0x19e/0x840 [bnxt_re] bnxt_re_create_qp+0x9b1/0xcd0 [bnxt_re] ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? __kmalloc+0x1b6/0x4f0 ? create_qp.part.0+0x128/0x1c0 [ib_core] ? __pfx_bnxt_re_create_qp+0x10/0x10 [bnxt_re] create_qp.part.0+0x128/0x1c0 [ib_core] ib_create_qp_kernel+0x50/0xd0 [ib_core] create_mad_qp+0x8e/0xe0 [ib_core] ? __pfx_qp_event_handler+0x10/0x10 [ib_core] ib_mad_init_device+0x2be/0x680 [ib_core] add_client_context+0x10d/0x1a0 [ib_core] enable_device_and_get+0xe0/0x1d0 [ib_core] ib_register_device+0x53c/0x630 [ib_core] ? srso_alias_return_thunk+0x5/0xfbef5 bnxt_re_probe+0xbd8/0xe50 [bnxt_re] ? __pfx_bnxt_re_probe+0x10/0x10 [bnxt_re] auxiliary_bus_probe+0x49/0x80 ? driver_sysfs_add+0x57/0xc0 really_probe+0xde/0x340 ? pm_runtime_barrier+0x54/0x90 ? __pfx___driver_attach+0x10/0x10 __driver_probe_device+0x78/0x110 driver_probe_device+0x1f/0xa0 __driver_attach+0xba/0x1c0 bus_for_each_dev+0x8f/0xe0 bus_add_driver+0x146/0x220 driver_register+0x72/0xd0 __auxiliary_driver_register+0x6e/0xd0 ? __pfx_bnxt_re_mod_init+0x10/0x10 [bnxt_re] bnxt_re_mod_init+0x3e/0xff0 [bnxt_re] ? __pfx_bnxt_re_mod_init+0x10/0x10 [bnxt_re] do_one_initcall+0x5b/0x310 do_init_module+0x90/0x250 init_module_from_file+0x86/0xc0 idempotent_init_module+0x121/0x2b0 __x64_sys_finit_module+0x5e/0xb0 do_syscall_64+0x82/0x160 ? srso_alias_return_thunk+0x5/0xfbef5 ? syscall_exit_to_user_mode_prepare+0x149/0x170 ? srso_alias_return_thunk+0x5/0xfbef5 ? syscall_exit_to_user_mode+0x75/0x230 ? srso_alias_return_thunk+0x5/0xfbef5 ? do_syscall_64+0x8e/0x160 ? srso_alias_return_thunk+0x5/0xfbef5 ? __count_memcg_events+0x69/0x100 ? srso_alias_return_thunk+0x5/0xfbef5 ? count_memcg_events.constprop.0+0x1a/0x30 ? srso_alias_return_thunk+0x5/0xfbef5 ? handle_mm_fault+0x1f0/0x300 ? srso_alias_return_thunk+0x5/0xfbef5 ? do_user_addr_fault+0x34e/0x640 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f4e5132821d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 db 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffca9c906a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 0000563ec8a8f130 RCX: 00007f4e5132821d RDX: 0000000000000000 RSI: 00007f4e518fa07d RDI: 000000000000003b RBP: 00007ffca9c90760 R08: 00007f4e513f6b20 R09: 00007ffca9c906f0 R10: 0000563ec8a8faa0 R11: 0000000000000246 R12: 00007f4e518fa07d R13: 0000000000020000 R14: 0000563ec8409e90 R15: 0000563ec8a8fa60 </TASK> ---[ end trace ]--- Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Link: https://lore.kernel.org/r/20240507103929.30003-1-mschmidt@redhat.com Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-08RDMA/efa: Support QP with unsolicited write w/ imm. receiveMichael Margolin4-3/+31
Add a new EFA flags attribute for QP creation, and support unsolicited write with immediate flag. QPs created with this flag set will not consume receive work requests for incoming RDMA write with immediate. Expose device capability bit for this feature support. Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Firas Jahjah <firasj@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Link: https://lore.kernel.org/r/20240506151829.6475-1-mrgolin@amazon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05IB/hfi1: Remove generic .ndo_get_stats64Breno Leitao1-1/+0
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is doing some custom stats collection, it does not need to set .ndo_get_stats64. Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64 function pointer. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240503111333.552360-2-leitao@debian.org Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05IB/hfi1: Do not use custom stat allocatorBreno Leitao1-16/+3
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver. With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now. Remove the allocation in the hfi1 driver and leverage the network core allocation instead. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240503111333.552360-1-leitao@debian.org Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05RDMA/hfi1: Use RMW accessors for changing LNKCTL2Ilpo Järvinen1-22/+8
Convert open coded RMW accesses for LNKCTL2 to use pcie_capability_clear_and_set_word() which makes its easier to understand what the code tries to do. In addition, this futureproofs the code. LNKCTL2 is not really owned by any driver because it is a collection of control bits that PCI core might need to touch. RMW accessors already have support for proper locking for a selected set of registers to avoid losing concurrent updates (LNKCTL2 is not yet among the registers that need protection but likely will be in the future). Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240503133640.15899-1-ilpo.jarvinen@linux.intel.com Reviewed-by: Dean Luick <dean.luick@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05RDMA/mana_ib: implement uapi for creation of rnic cqKonstantin Taranov1-4/+51
Enable users to create RNIC CQs using a corresponding flag. With the previous request size, an ethernet CQ is created. As a response, return ID of the created CQ. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1714137160-5222-6-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05RDMA/mana_ib: boundary check before installing cq callbacksKonstantin Taranov1-0/+2
Add a boundary check inside mana_ib_install_cq_cb to prevent index overflow. Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function") Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1714137160-5222-5-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05RDMA/mana_ib: introduce a helper to remove cq callbacksKonstantin Taranov3-29/+17
Intoduce the mana_ib_remove_cq_cb helper to remove cq callbacks. The helper removes code duplicates. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1714137160-5222-4-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05RDMA/mana_ib: create and destroy RNIC cqsKonstantin Taranov2-0/+86
Implement RNIC requests for creation and destruction of RNIC CQs. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1714137160-5222-3-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05RDMA/mana_ib: create EQs for RNIC CQsKonstantin Taranov2-2/+33
Create EQs within mana_ib device. Such EQs are required for creation of RNIC CQs. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1714137160-5222-2-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-04IB/qib: Use device_show_string() helper for sysfs attributesLukas Wunner3-15/+2
Deduplicate sysfs ->show() callbacks which expose a string at a static memory location. Use the newly introduced device_show_string() helper in the driver core instead by declaring those sysfs attributes with DEVICE_STRING_ATTR_RO(). No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/185a88dd3e6e18bf508a875d87c95ea2a5c3fa13.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-03IB/hfi1: allocate dummy net_device dynamicallyBreno Leitao2-3/+8
Embedding net_device into structures prohibits the usage of flexible arrays in the net_device structure. For more details, see the discussion at [1]. Un-embed the net_device from struct hfi1_netdev_rx by converting it into a pointer. Then use the leverage alloc_netdev() to allocate the net_device object at hfi1_alloc_rx(). [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Leon Romanovsky <leon@kernel.org> Link: https://lore.kernel.org/r/20240430162213.746492-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02RDMA/mlx5: Remove NULL check before dev_{put, hold}Jules Irenge1-2/+1
Coccinelle reports a warning WARNING: NULL check before dev_{put, hold} functions is not needed The reason is the call netdev_{put, hold} of dev_{put,hold} will check NULL There is no need to check before using dev_{put, hold} Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Link: https://lore.kernel.org/r/ZjGC4qXrOwZE0aHi@octinomon.home Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-30RDMA/mlx5: Track DCT, DCI and REG_UMR QPs as diver_detail resources.Chiara Meiohas2-2/+30
Allow user to see driver-specific QPs (the "driver_detail" QPs) through the rdmatool, when requested. When creating DCT, DCI and REG_UMR QPs, we designate them as driver_detail resources. When filling the QP info for the rdma tool, for the driver_detail QPs: -the QP type is IB_QPT_DRIVER -the subtype is a string with the QP name ("DCT", "DCI", "REG_UMR") Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Link: https://lore.kernel.org/r/452432d7d0917f053a80a893a614169857fe3b10.1713268997.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-29RDMA/efa: Add shutdown notifierMichael Margolin1-0/+11
Add driver function to stop the device and release any active IRQs as preparation for shutdown. This should fix issues caused by unexpected AQ interrupts when booting kernel using kexec and possible data integrity issues when the system is being shutdown during traffic. Link: https://lore.kernel.org/r/20240425171814.25216-1-mrgolin@amazon.com Reviewed-by: Firas Jahjah <firasj@amazon.com> Reviewed-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+2
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_prueth.c net/mac80211/chan.c 89884459a0b9 ("wifi: mac80211: fix idle calculation with multi-link") 87f5500285fb ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()") https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/ net/unix/garbage.c 1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") 4090fa373f0e ("af_unix: Replace garbage collection algorithm.") drivers/net/ethernet/ti/icssg/icssg_prueth.c drivers/net/ethernet/ti/icssg/icssg_common.c 4dcd0e83ea1d ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()") e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25RDMA/vmw_pvrdma: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACYDamien Le Moal1-1/+1
Use the macro PCI_IRQ_INTX instead of the deprecated PCI_IRQ_LEGACY macro. Link: https://lore.kernel.org/r/20240325070944.3600338-13-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-25IB/qib: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACYDamien Le Moal3-5/+4
Use the macro PCI_IRQ_INTX instead of the deprecated PCI_IRQ_LEGACY macro. Link: https://lore.kernel.org/r/20240325070944.3600338-12-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
2024-04-23RDMA/mana_ib: Fix missing ret valueKonstantin Taranov1-0/+1
Set ret to -ENODEV when netdev_master_upper_dev_get_rcu returns NULL. Fixes: 8b184e4f1c32 ("RDMA/mana_ib: Enable RoCE on port 1") Link: https://lore.kernel.org/r/1713881751-21621-1-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22IB/mlx5: Use __iowrite64_copy() for write combining storesJason Gunthorpe1-5/+3
mlx5 has a built in self-test at driver startup to evaluate if the platform supports write combining to generate a 64 byte PCIe TLP or not. This has proven necessary because a lot of common scenarios end up with broken write combining (especially inside virtual machines) and there is other way to learn this information. This self test has been consistently failing on new ARM64 CPU designs (specifically with NVIDIA Grace's implementation of Neoverse V2). The C loop around writeq() generates some pretty terrible ARM64 assembly, but historically this has worked on a lot of existing ARM64 CPUs till now. We see it succeed about 1 time in 10,000 on the worst effected systems. The CPU architects speculate that the load instructions interspersed with the stores makes the WC buffers statistically flush too often and thus the generation of large TLPs becomes infrequent. This makes the boot up test unreliable in that it indicates no write-combining, however userspace would be fine since it uses a ST4 instruction. Further, S390 has similar issues where only the special zpci_memcpy_toio() will actually generate large TLPs, and the open coded loop does not trigger it at all. Fix both ARM64 and S390 by switching to __iowrite64_copy() which now provides architecture specific variants that have a high change of generating a large TLP with write combining. x86 continues to use a similar writeq loop in the generate __iowrite64_copy(). Fixes: 11f552e21755 ("IB/mlx5: Test write combining support") Link: https://lore.kernel.org/r/6-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-16RDMA/hns: Modify the print level of CQE errorChengchang Tang1-2/+3
Too much print may lead to a panic in kernel. Change ibdev_err() to ibdev_err_ratelimited(), and change the printing level of cqe dump to debug level. Fixes: 7c044adca272 ("RDMA/hns: Simplify the cqe code of poll cq") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-11-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Use complete parentheses in macrosChengchang Tang1-6/+6
Use complete parentheses to ensure that macro expansion does not produce unexpected results. Fixes: a25d13cbe816 ("RDMA/hns: Add the interfaces to support multi hop addressing for the contexts in hip08") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-10-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Add mutex_destroy()wenglianfa6-5/+39
Add mutex_destroy(). Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-9-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix GMV table pagesizeChengchang Tang1-1/+1
GMV's BA table only supports 4K pages. Currently, PAGESIZE is used to calculate gmv_bt_num, which will cause an abnormal number of gmv_bt_num in a 64K OS. Fixes: d6d91e46210f ("RDMA/hns: Add support for configuring GMV table") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-8-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix mismatch exception rollbackwenglianfa1-1/+1
When dma_alloc_coherent() fails in hns_roce_alloc_hem(), just call kfree() to release hem instead of hns_roce_free_hem(). Fixes: c00743cbf2b8 ("RDMA/hns: Simplify 'struct hns_roce_hem' allocation") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-7-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix UAF for cq async eventChengchang Tang1-11/+13
The refcount of CQ is not protected by locks. When CQ asynchronous events and CQ destruction are concurrent, CQ may have been released, which will cause UAF. Use the xa_lock() to protect the CQ refcount. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-6-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix deadlock on SRQ async events.Chengchang Tang2-3/+4
xa_lock for SRQ table may be required in AEQ. Use xa_store_irq()/ xa_erase_irq() to avoid deadlock. Fixes: 81fce6291d99 ("RDMA/hns: Add SRQ asynchronous event support") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Add max_ah and cq moderation capacities in query_device()Chengchang Tang4-2/+12
Add max_ah and cq moderation capacities to hns_roce_query_device(). Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Remove unused parameters and variablesChengchang Tang7-33/+20
Remove unused parameters and variables. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Use macro instead of magic numberYangyang Li3-16/+34
Use macro instead of magic number. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix return value in hns_roce_map_mr_sgZhengchao Shao1-8/+7
As described in the ib_map_mr_sg function comment, it returns the number of sg elements that were mapped to the memory region. However, hns_roce_map_mr_sg returns the number of pages required for mapping the DMA area. Fix it. Fixes: 9b2cf76c9f05 ("RDMA/hns: Optimize PBL buffer allocation process") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20240411033851.2884771-1-shaozhengchao@huawei.com Reviewed-by: Junxian Huang <huangjunxian6@hisilicon.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Configure mac address in RNICKonstantin Taranov3-0/+46
Set local mac address in RNIC, which is required by the HW. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-7-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Adding and deleting GIDsKonstantin Taranov3-0/+97
Implement add_gid and del_gid for RNIC. IPv4 and IPv6 addresses are supported. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-6-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Enable RoCE on port 1Konstantin Taranov2-4/+26
Set netdev and RoCEv2 flag to enable GID population on port 1. Use GIDs of the master netdev. As mc->ports[] stores slave devices, use a helper to get the master netdev. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-5-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Implement port parametersKonstantin Taranov3-1/+42
Implement port parameters for RNIC: 1) extend query_port() method 2) implement get_link_layer() 3) implement query_pkey() Only port 1 can store GIDs. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-4-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Create and destroy rnic adapterKonstantin Taranov3-1/+79
Add functions for RNIC creation and destruction. If creation fails, the ib_probe fails as well. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-3-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Add EQ creation for rnic adapterKonstantin Taranov3-3/+41
Create an error EQ for the RNIC adapter. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-2-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>