summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)AuthorFilesLines
2024-02-04RDMA/irdma: Remove duplicate assignmentMustafa Ismail1-2/+1
Remove the unneeded assignment of the qp_num which is already set in irdma_create_qp(). Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Link: https://lore.kernel.org/r/20240131233953.400483-1-sindhu.devale@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-31RDMA/efa: Limit EQs to available MSI-X vectorsYonatan Nachum2-17/+16
When creating EQs we take into consideration the max number of EQs the device reported it can support and the number of available CPUs. There are situations where the number of EQs the device reported it can support and the PCI configuration of MSI-X is different, take it in account as well when creating EQs. Also request at least 1 MSI-X vector for the management queue and allow the kernel to return a number of vectors in a range between 1 and the max supported MSI-X vectors according to the PCI config. Reviewed-by: Michael Margolin <mrgolin@amazon.com> Reviewed-by: Yonatan Goldhirsh <ygold@amazon.com> Signed-off-by: Yonatan Nachum <ynachum@amazon.com> Link: https://lore.kernel.org/r/20240131093403.18624-1-ynachum@amazon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/mlx5: Delete unused mlx5_ib_copy_pas prototypeAlexey Dobriyan1-1/+0
mlx5_ib_copy_pas() doesn't exist anymore. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Link: https://lore.kernel.org/r/a2cb861e-d11e-4567-8a73-73763d1dc199@p183 Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/cxgb4: Delete unused c4iw_ep_redirect prototypeAlexey Dobriyan1-2/+0
c4iw_ep_redirect() doesn't exist. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Link: https://lore.kernel.org/r/a67881e7-469b-43d9-9973-ad7579eb2c4b@p183 Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper functionKonstantin Taranov3-26/+27
Use a helper function to install callbacks to CQs. This patch removes code repetition. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1705965781-3235-4-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/mana_ib: Introduce mana_ib_get_netdev helper functionKonstantin Taranov3-41/+32
Use a helper function to access netdevs using a port number. This patch removes code repetitions as well as removes the need to explicitly use gdma_dev, which was error-prone. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1705965781-3235-3-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/mana_ib: Introduce mdev_to_gc helper functionKonstantin Taranov5-37/+27
Use a helper function to access gdma_context from mana_ib_dev. This patch removes code repetitions as well as removes the need to explicitly use gdma_dev, which was error-prone. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1705965781-3235-2-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/hns: Simplify 'struct hns_roce_hem' allocationYunsheng Lin3-136/+24
'struct hns_roce_hem' is used to refer to the last level of dma buffer managed by the hw, pointed by a single BA(base address) in the previous level of BT(base table), so the dma buffer in 'struct hns_roce_hem' must be contiguous. Right now the size of dma buffer in 'struct hns_roce_hem' is decided by mhop->buf_chunk_size in get_hem_table_config(), which ensure the mhop->buf_chunk_size is power of two of PAGE_SIZE, so there will be only one contiguous dma buffer allocated in hns_roce_alloc_hem(), which means hem->chunk_list and chunk->mem for linking multi dma buffers is unnecessary. This patch removes the hem->chunk_list and chunk->mem and other related macro and function accordingly. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240113085935.2838701-7-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/hns: Support adaptive PBL hopnumChengchang Tang1-3/+54
In the current implementation, a fixed addressing level is used for PBL. But in fact, the necessary addressing level is related to page size and the size of MR. This patch calculates the addressing level according to page size and the size of MR, and uses the addressing level to configure the PBL. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240113085935.2838701-6-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/hns: Support flexible umem page sizeChengchang Tang2-38/+92
In the current implementation, a fixed page size is used to configure the umem PBL, which is not flexible enough and is not conducive to the performance of the HW. Find a best page size to get better performance. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240113085935.2838701-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/hns: Alloc MTR memory before alloc_mtt()Chengchang Tang1-20/+27
MTR memory allocation do not depend on allocation of mtt. This patch moves the allocation of mtr before mtt in preparation for the following optimization. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240113085935.2838701-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/hns: Refactor mtr_init_buf_cfg()Chengchang Tang1-31/+45
page_shift and page_cnt is only used in mtr_map_bufs(). And these parameter could be calculated indepedently. Strip the computation of page_shift and page_cnt from mtr_init_buf_cfg(), reducing the number of parameters of it. This helps reducing coupling between mtr_init_buf_cfg() and mtr_map_bufs(). Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240113085935.2838701-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25RDMA/hns: Refactor mtr findChengchang Tang4-85/+121
hns_roce_mtr_find() is a collection of multiple functions, and the return value is also difficult to understand, which is not conducive to modification and maintenance. Separate the function of obtaining MTR root BA from this function. And some adjustments has been made to improve readability. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240113085935.2838701-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-25IB/hfi1: fix spellos and kernel-docRandy Dunlap1-12/+13
Fix spelling mistakes as reported by codespell. Fix kernel-doc warnings. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: linux-rdma@vger.kernel.org Link: https://lore.kernel.org/r/20240111042754.17530-1-rdunlap@infradead.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds50-296/+1509
Pull rdma updates from Jason Gunthorpe: "Small cycle, with some typical driver updates: - General code tidying in siw, hfi1, idrdma, usnic, hns rtrs and bnxt_re - Many small siw cleanups without an overeaching theme - Debugfs stats for hns - Fix a TX queue timeout in IPoIB and missed locking of the mcast list - Support more features of P7 devices in bnxt_re including a new work submission protocol - CQ interrupts for MANA - netlink stats for erdma - EFA multipath PCI support - Fix Incorrect MR invalidation in iser" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (66 commits) RDMA/bnxt_re: Fix error code in bnxt_re_create_cq() RDMA/efa: Add EFA query MR support IB/iser: Prevent invalidating wrong MR RDMA/erdma: Add hardware statistics support RDMA/erdma: Introduce dma pool for hardware responses of CMDQ requests IB/iser: iscsi_iser.h: fix kernel-doc warning and spellos RDMA/mana_ib: Add CQ interrupt support for RAW QP RDMA/mana_ib: query device capabilities RDMA/mana_ib: register RDMA device with GDMA RDMA/bnxt_re: Fix the sparse warnings RDMA/bnxt_re: Fix the offset for GenP7 adapters for user applications RDMA/bnxt_re: Share a page to expose per CQ info with userspace RDMA/bnxt_re: Add UAPI to share a page with user space IB/ipoib: Fix mcast list locking RDMA/mlx5: Expose register c0 for RDMA device net/mlx5: E-Switch, expose eswitch manager vport net/mlx5: Manage ICM type of SW encap RDMA/mlx5: Support handling of SW encap ICM area net/mlx5: Introduce indirect-sw-encap ICM properties RDMA/bnxt_re: Adds MSN table capability for Gen P7 adapters ...
2024-01-09Merge tag 'arm64-upstream' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "CPU features: - Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye olde Thunder-X machines - Avoid mapping KPTI trampoline when it is not required - Make CPU capability API more robust during early initialisation Early idreg overrides: - Remove dependencies on core kernel helpers from the early command-line parsing logic in preparation for moving this code before the kernel is mapped FPsimd: - Restore kernel-mode fpsimd context lazily, allowing us to run fpsimd code sequences in the kernel with pre-emption enabled KBuild: - Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y - Makefile cleanups LPA2 prep: - Preparatory work for enabling the 'LPA2' extension, which will introduce 52-bit virtual and physical addressing even with 4KiB pages (including for KVM guests). Misc: - Remove dead code and fix a typo MM: - Pass NUMA node information for IRQ stack allocations Perf: - Add perf support for the Synopsys DesignWare PCIe PMU - Add support for event counting thresholds (FEAT_PMUv3_TH) introduced in Armv8.8 - Add support for i.MX8DXL SoCs to the IMX DDR PMU driver. - Minor PMU driver fixes and optimisations RIP VPIPT: - Remove what support we had for the obsolete VPIPT I-cache policy Selftests: - Improvements to the SVE and SME selftests Stacktrace: - Refactor kernel unwind logic so that it can used by BPF unwinding and, eventually, reliable backtracing Sysregs: - Update a bunch of register definitions based on the latest XML drop from Arm" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits) kselftest/arm64: Don't probe the current VL for unsupported vector types efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad arm64: properly install vmlinuz.efi arm64/sysreg: Add missing system instruction definitions for FGT arm64/sysreg: Add missing system register definitions for FGT arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1 arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1 arm64: memory: remove duplicated include arm: perf: Fix ARCH=arm build with GCC arm64: Align boot cpucap handling with system cpucap handling arm64: Cleanup system cpucap handling MAINTAINERS: add maintainers for DesignWare PCIe PMU driver drivers/perf: add DesignWare PCIe PMU driver PCI: Move pci_clear_and_set_dword() helper to PCI header PCI: Add Alibaba Vendor ID to linux/pci_ids.h docs: perf: Add description for Synopsys DesignWare PCIe PMU driver arm64: irq: set the correct node for shadow call stack Revert "perf/arm_dmc620: Remove duplicate format attribute #defines" arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD arm64: fpsimd: Preserve/restore kernel mode NEON at context switch ...
2024-01-08Merge tag 'vfs-6.8.misc' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Add Jan Kara as VFS reviewer - Show correct device and inode numbers in proc/<pid>/maps for vma files on stacked filesystems. This is now easily doable thanks to the backing file work from the last cycles. This comes with selftests Cleanups: - Remove a redundant might_sleep() from wait_on_inode() - Initialize pointer with NULL, not 0 - Clarify comment on access_override_creds() - Rework and simplify eventfd_signal() and eventfd_signal_mask() helpers - Process aio completions in batches to avoid needless wakeups - Completely decouple struct mnt_idmap from namespaces. We now only keep the actual idmapping around and don't stash references to namespaces - Reformat maintainer entries to indicate that a given subsystem belongs to fs/ - Simplify fput() for files that were never opened - Get rid of various pointless file helpers - Rename various file helpers - Rename struct file members after SLAB_TYPESAFE_BY_RCU switch from last cycle - Make relatime_need_update() return bool - Use GFP_KERNEL instead of GFP_USER when allocating superblocks - Replace deprecated ida_simple_*() calls with their current ida_*() counterparts Fixes: - Fix comments on user namespace id mapping helpers. They aren't kernel doc comments so they shouldn't be using /** - s/Retuns/Returns/g in various places - Add missing parameter documentation on can_move_mount_beneath() - Rename i_mapping->private_data to i_mapping->i_private_data - Fix a false-positive lockdep warning in pipe_write() for watch queues - Improve __fget_files_rcu() code generation to improve performance - Only notify writer that pipe resizing has finished after setting pipe->max_usage otherwise writers are never notified that the pipe has been resized and hang - Fix some kernel docs in hfsplus - s/passs/pass/g in various places - Fix kernel docs in ntfs - Fix kcalloc() arguments order reported by gcc 14 - Fix uninitialized value in reiserfs" * tag 'vfs-6.8.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (36 commits) reiserfs: fix uninit-value in comp_keys watch_queue: fix kcalloc() arguments order ntfs: dir.c: fix kernel-doc function parameter warnings fs: fix doc comment typo fs tree wide selftests/overlayfs: verify device and inode numbers in /proc/pid/maps fs/proc: show correct device and inode numbers in /proc/pid/maps eventfd: Remove usage of the deprecated ida_simple_xx() API fs: super: use GFP_KERNEL instead of GFP_USER for super block allocation fs/hfsplus: wrapper.c: fix kernel-doc warnings fs: add Jan Kara as reviewer fs/inode: Make relatime_need_update return bool pipe: wakeup wr_wait after setting max_usage file: remove __receive_fd() file: stop exposing receive_fd_user() fs: replace f_rcuhead with f_task_work file: remove pointless wrapper file: s/close_fd_get_file()/file_close_fd()/g Improve __fget_files_rcu() code generation (and thus __fget_light()) file: massage cleanup of files that failed to open fs/pipe: Fix lockdep false-positive in watchqueue pipe_write() ...
2024-01-08RDMA/bnxt_re: Fix error code in bnxt_re_create_cq()Dan Carpenter1-2/+4
Return -ENOMEM if get_zeroed_page() fails. Don't return success. Fixes: e275919d9669 ("RDMA/bnxt_re: Share a page to expose per CQ info with userspace") Link: https://lore.kernel.org/r/d714306e-b7d7-4e89-b973-a9ff0f260c78@moroto.mountain Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-07RDMA/efa: Add EFA query MR supportMichael Margolin6-6/+140
Add EFA driver uapi definitions and register a new query MR method that currently returns the physical interconnects the device is using to reach the MR. Update admin definitions and efa-abi accordingly. Reviewed-by: Anas Mousa <anasmous@amazon.com> Reviewed-by: Firas Jahjah <firasj@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Link: https://lore.kernel.org/r/20240104095155.10676-1-mrgolin@amazon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-30RDMA/erdma: Add hardware statistics supportCheng Xu4-0/+133
First, we add a new command to query hardware statistics, and then implement two functions: ib_device_ops.alloc_hw_port_stats and ib_device_ops.get_hw_stats to allow rdma tool can get the statistics of erdma device. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20231227084800.99091-3-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-30RDMA/erdma: Introduce dma pool for hardware responses of CMDQ requestsCheng Xu3-2/+26
Hardware response, such as the result of query statistics, may be too long to be directly accommodated within the CQE structure. To address this, we introduce a DMA pool to hold the hardware's responses of CMDQ requests. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20231227084800.99091-2-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-20RDMA/mana_ib: Add CQ interrupt support for RAW QPLong Li3-7/+102
At probing time, the MANA core code allocates EQs for supporting interrupts on Ethernet queues. The same interrupt mechanisum is used by RAW QP. Use the same EQs for delivering interrupts on the CQ for the RAW QP. Signed-off-by: Long Li <longli@microsoft.com> Link: https://lore.kernel.org/r/1702692255-23640-4-git-send-email-longli@linuxonhyperv.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-20RDMA/mana_ib: query device capabilitiesLong Li5-16/+112
With RDMA device registered, use it to query on hardware capabilities and cache this information for future query requests to the driver. Signed-off-by: Long Li <longli@microsoft.com> Link: https://lore.kernel.org/r/1702692255-23640-3-git-send-email-longli@linuxonhyperv.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-20RDMA/mana_ib: register RDMA device with GDMALong Li3-14/+29
Software client needs to register with the RDMA management interface on the SoC to access more features, including querying device capabilities and RC queue pair. Signed-off-by: Long Li <longli@microsoft.com> Link: https://lore.kernel.org/r/1702692255-23640-2-git-send-email-longli@linuxonhyperv.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-20RDMA/bnxt_re: Fix the sparse warningsSelvin Xavier2-2/+2
Fix the following warnings reported drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:909:27: warning: invalid assignment: |= drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:909:27: left side has type restricted __le16 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:909:27: right side has type unsigned long ... drivers/infiniband/hw/bnxt_re/qplib_fp.c:1620:44: warning: invalid assignment: |= drivers/infiniband/hw/bnxt_re/qplib_fp.c:1620:44: left side has type restricted __le64 drivers/infiniband/hw/bnxt_re/qplib_fp.c:1620:44: right side has type unsigned long long Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312200537.HoNqPL5L-lkp@intel.com/ Fixes: 07f830ae4913 ("RDMA/bnxt_re: Adds MSN table capability for Gen P7 adapters") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1703046717-8914-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-20RDMA/bnxt_re: Fix the offset for GenP7 adapters for user applicationsSelvin Xavier1-3/+5
User Doorbell page indexes start at an offset for GenP7 adapters. Fix the offset that will be used for user doorbell page indexes. Fixes: a62d68581441 ("RDMA/bnxt_re: Update the BAR offsets") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1702987900-5363-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-17RDMA/bnxt_re: Share a page to expose per CQ info with userspaceSelvin Xavier5-8/+74
Gen P7 adapters needs to share a toggle bits information received in kernel driver with the user space. User space needs this info during the request notify call back to arm the CQ. User space application can get this page using the UAPI routines. Library will mmap this page and get the toggle bits to be used in the next ARM Doorbell. Uses a hash list to map the CQ structure from the CQ ID. CQ structure is retrieved from the hash list while the library calls the UAPI routine to get the toggle page mapping. Currently the full page is mapped per CQ. This can be optimized to enable multiple CQs from the same application share the same page and different offsets in the page. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1702535484-26844-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-17RDMA/bnxt_re: Add UAPI to share a page with user spaceSelvin Xavier2-0/+106
Gen P7 adapters require to share a toggle value for CQ and SRQ. This is received by the driver as part of interrupt notifications and needs to be shared with the user space. Add a new UAPI infrastructure to get the shared page for CQ and SRQ. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1702535484-26844-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-13PCI: Add Alibaba Vendor ID to linux/pci_ids.hShuai Xue1-2/+0
The Alibaba Vendor ID (0x1ded) is now used by Alibaba elasticRDMA ("erdma") and will be shared with the upcoming PCIe PMU ("dwc_pcie_pmu"). Move the Vendor ID to linux/pci_ids.h so that it can shared by several drivers later. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20231208025652.87192-3-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12Expose c0 and SW encap ICM for RDMALeon Romanovsky10-22/+79
These two series from Mark and Shun extend RDMA mlx5 API. Mark's series provides c0 register used to match egress traffic sent by local device. Shun's series adds new type for ICM area. Link: https://lore.kernel.org/all/cover.1701871118.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-12RDMA/mlx5: Expose register c0 for RDMA deviceMark Bloch1-0/+24
This patch introduces improvements for matching egress traffic sent by the local device. When applicable, all egress traffic from the local vport is now tagged with the provided value. This enhancement is particularly useful for FDB steering purposes. The primary focus of this update is facilitating the transmission of traffic from the hypervisor to a VF. To achieve this, one must initiate an SQ on the hypervisor and subsequently create a rule in the FDB that matches on the eswitch manager vport and the SQN of the aforementioned SQ. Obtaining the SQN can be had from SQ opened, and the eswitch manager vport match can be substituted with the register c0 value exposed by this patch. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/aa4120a91c98ff1c44f1213388c744d4cb0324d6.1701871118.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-12RDMA/mlx5: Support handling of SW encap ICM areaShun Hao2-0/+6
New type for this ICM area, now the user can allocate/deallocate the new type of SW encap ICM memory, to store the encap header data which are managed by SW. Signed-off-by: Shun Hao <shunh@nvidia.com> Link: https://lore.kernel.org/r/546fe43fc700240709e30acf7713ec6834d652bd.1701871118.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-11RDMA/bnxt_re: Adds MSN table capability for Gen P7 adaptersSelvin Xavier4-6/+86
GenP7 HW expects an MSN table instead of PSN table. Check for the HW retransmission capability and populate the MSN table if HW retansmission is supported. Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1701946060-13931-7-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-11RDMA/bnxt_re: Doorbell changesSelvin Xavier1-11/+35
Update the Doorbell routines to support the latest HW definitions. Use common routine to prepare the Doorbell key. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1701946060-13931-6-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-11RDMA/bnxt_re: Get the toggle bits from CQ completionsSelvin Xavier3-0/+8
Get the toggle bits from CQ completions. For older adapters these values are 0. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1701946060-13931-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-11RDMA/bnxt_re: Update the HW interface definitionsSelvin Xavier1-10/+57
Adds HW interface definitions to support the new chip revision. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1701946060-13931-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-11RDMA/bnxt_re: Update the BAR offsetsSelvin Xavier2-16/+10
Update the BAR offsets for handling GenP7 adapters. Use the values populated by L2 driver for getting the Doorbell offsets. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1701946060-13931-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-11RDMA/bnxt_re: Support new 5760X P7 devicesSelvin Xavier8-24/+38
Add basic support for 5760X P7 devices. Add new chip revisions. The first version support is similar to the existing P5 adapters. Extend the current support for P5 adapters to P7 also. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1701946060-13931-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-07RDMA/hns: Fix memory leak in free_mr_init()Chengchang Tang1-0/+4
When a reserved QP fails to be created, the memory of the remaining created reserved QPs is leaked. Fixes: 70f92521584f ("RDMA/hns: Use the reserved loopback QPs to free MR before destroying MPT") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20231207114231.2872104-6-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-07RDMA/hns: Remove unnecessary checks for NULL in mtr_alloc_bufs()Chengchang Tang1-1/+1
ib_umem_get() never return NULL. Fixes: 3c873161a0d7 ("RDMA/hns: Add support for addressing when hopnum is 0") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20231207114231.2872104-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-07RDMA/hns: Add a max length of gid tableJunxian Huang1-2/+9
IB-core and rdma-core restrict the sgid_index specified by users, which is uint8_t/u8 data type, to only be within the range of 0-255, so it's meaningless to support excessively large gid_table_len. On the other hand, ib-core creates as many sysfs gid files as gid_table_len, most of which are not only useless because of the reason above, but also greatly increase the traversal time of the sysfs gid files for applications. This patch limits the maximum length of gid table to 256. Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20231207114231.2872104-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-07RDMA/hns: Response dmac to userspaceJunxian Huang1-0/+7
While creating AH, dmac is already resolved in kernel. Response dmac to userspace so that userspace doesn't need to resolve dmac repeatedly. Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20231207114231.2872104-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-07RDMA/hns: Rename the interruptsChengchang Tang1-3/+4
Now, different devices may have the same interrupt name, which makes it difficult for users to distinguish between these interrupts. Modify the naming style to be consistent with our network devices. Before: "hns-aeq-0" "hns-ceq-0" ... Now: "hns-0000:35:00.0-aeq-0" "hns-0000:35:00.0-ceq-0" ... Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20231207114231.2872104-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-12-05RDMA/irdma: Avoid free the non-cqp_request scratchShifeng Li1-3/+1
When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-12-05RDMA/irdma: Fix support for 64k pagesMike Marciniszyn1-1/+1
Virtual QP and CQ require a 4K HW page size but the driver passes PAGE_SIZE to ib_umem_find_best_pgsz() instead. Fix this by using the appropriate 4k value in the bitmap passed to ib_umem_find_best_pgsz(). Fixes: 693a5386eff0 ("RDMA/irdma: Split mr alloc and free into new functions") Link: https://lore.kernel.org/r/20231129202143.1434-4-shiraz.saleem@intel.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-12-05RDMA/irdma: Ensure iWarp QP queue memory is OS paged alignedMike Marciniszyn1-0/+5
The SQ is shared for between kernel and used by storing the kernel page pointer and passing that to a kmap_atomic(). This then requires that the alignment is PAGE_SIZE aligned. Fix by adding an iWarp specific alignment check. Fixes: e965ef0e7b2c ("RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qp") Link: https://lore.kernel.org/r/20231129202143.1434-3-shiraz.saleem@intel.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-11-28eventfd: simplify eventfd_signal()Christian Brauner1-1/+1
Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument. Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-22RDMA/irdma: Fix UAF in irdma_sc_ccq_get_cqe_info()Shifeng Li1-3/+3
When removing the irdma driver or unplugging its aux device, the ccq queue is released before destorying the cqp_cmpl_wq queue. But in the window, there may still be completion events for wqes. That will cause a UAF in irdma_sc_ccq_get_cqe_info(). [34693.333191] BUG: KASAN: use-after-free in irdma_sc_ccq_get_cqe_info+0x82f/0x8c0 [irdma] [34693.333194] Read of size 8 at addr ffff889097f80818 by task kworker/u67:1/26327 [34693.333194] [34693.333199] CPU: 9 PID: 26327 Comm: kworker/u67:1 Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [34693.333200] Hardware name: SANGFOR Inspur/NULL, BIOS 4.1.13 08/01/2016 [34693.333211] Workqueue: cqp_cmpl_wq cqp_compl_worker [irdma] [34693.333213] Call Trace: [34693.333220] dump_stack+0x71/0xab [34693.333226] print_address_description+0x6b/0x290 [34693.333238] ? irdma_sc_ccq_get_cqe_info+0x82f/0x8c0 [irdma] [34693.333240] kasan_report+0x14a/0x2b0 [34693.333251] irdma_sc_ccq_get_cqe_info+0x82f/0x8c0 [irdma] [34693.333264] ? irdma_free_cqp_request+0x151/0x1e0 [irdma] [34693.333274] irdma_cqp_ce_handler+0x1fb/0x3b0 [irdma] [34693.333285] ? irdma_ctrl_init_hw+0x2c20/0x2c20 [irdma] [34693.333290] ? __schedule+0x836/0x1570 [34693.333293] ? strscpy+0x83/0x180 [34693.333296] process_one_work+0x56a/0x11f0 [34693.333298] worker_thread+0x8f/0xf40 [34693.333301] ? __kthread_parkme+0x78/0xf0 [34693.333303] ? rescuer_thread+0xc50/0xc50 [34693.333305] kthread+0x2a0/0x390 [34693.333308] ? kthread_destroy_worker+0x90/0x90 [34693.333310] ret_from_fork+0x1f/0x40 Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Shifeng Li <lishifeng1992@126.com> Link: https://lore.kernel.org/r/20231121101236.581694-1-lishifeng1992@126.com Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-11-22RDMA/bnxt_re: Correct module description stringKalesh AP1-1/+1
The word "Driver" is repeated twice in the "modinfo bnxt_re" output description. Fix it. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1700555387-6277-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-11-19RDMA/hns: Support SW stats with debugfsJunxian Huang12-43/+220
Support SW stats with debugfs. Query output: $ cat /sys/kernel/debug/hns_roce/hns_0/sw_stat/sw_stat aeqe --- 3341 ceqe --- 0 cmds --- 6764 cmds_err --- 0 posted_mbx --- 3344 polled_mbx --- 3 mbx_event --- 3341 qp_create_err --- 0 qp_modify_err --- 0 cq_create_err --- 0 cq_modify_err --- 0 srq_create_err --- 0 srq_modify_err --- 0 xrcd_alloc_err --- 0 mr_reg_err --- 0 mr_rereg_err --- 0 ah_create_err --- 0 mmap_err --- 0 uctx_alloc_err --- 0 Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Link: https://lore.kernel.org/r/20231114123449.1106162-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>