summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)AuthorFilesLines
2016-12-14IB/srp: Make mapping failures easier to debugBart Van Assche1-2/+10
Make it easier to figure out what is going on if memory mapping fails because more memory regions than mr_per_cmd are needed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/srp: Make login failures easier to debugBart Van Assche1-0/+3
If login fails because memory region allocation failed it can be hard to figure out what happened. Make it easier to figure out why login failed by logging a message if ib_alloc_mr() fails. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/srp: Introduce a local variable in srp_add_one()Bart Van Assche1-8/+9
This patch makes the srp_add_one() code more compact and does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n buildBart Van Assche1-5/+6
Avoid that the kernel build fails as follows if dynamic debug support is disabled: drivers/infiniband/ulp/srp/ib_srp.c:2272:3: error: implicit declaration of function 'DEFINE_DYNAMIC_DEBUG_METADATA' drivers/infiniband/ulp/srp/ib_srp.c:2272:33: error: 'ddm' undeclared (first use in this function) drivers/infiniband/ulp/srp/ib_srp.c:2275:39: error: '_DPRINTK_FLAGS_PRINT' undeclared (first use in this function) Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/multicast: Check ib_find_pkey() return valueBart Van Assche1-2/+5
This patch avoids that Coverity complains about not checking the ib_find_pkey() return value. Fixes: commit 547af76521b3 ("IB/multicast: Report errors on multicast groups if P_key changes") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IPoIB: Avoid reading an uninitialized member variableBart Van Assche1-2/+5
This patch avoids that Coverity reports the following: Using uninitialized value port_attr.state when calling printk Fixes: commit 94232d9ce817 ("IPoIB: Start multicast join process only on active ports") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Erez Shitrit <erezsh@mellanox.com> Cc: <stable@vger.kernel.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/mad: Fix an array index checkBart Van Assche1-1/+1
The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS (80) elements. Hence compare the array index with that value instead of with IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity reports the following: Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127). Fixes: commit b7ab0b19a85f ("IB/mad: Verify mgmt class in received MADs") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/mlx4: Rework special QP creation error pathBart Van Assche1-2/+4
The special QP creation error path relies on offset_of(struct mlx4_ib_sqp, qp) == 0. Remove this assumption because that makes the QP creation code easier to understand. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/srpt: Report login failures only onceBart Van Assche1-13/+9
Report the following message only once if no ACL has been configured yet for an initiator port: "Rejected login because no ACL has been configured yet for initiator %s.\n" Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/usnic: simplify IS_ERR_OR_NULL to IS_ERRJulia Lawall2-12/+12
The function usnic_ib_qp_grp_get_chunk only returns an ERR_PTR value or a valid pointer, never NULL. The same is true of get_qp_res_chunk, which just returns the result of calling usnic_ib_qp_grp_get_chunk. Simplify IS_ERR_OR_NULL to IS_ERR in both cases. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression t,e; @@ t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\) ... when != t=e - IS_ERR_OR_NULL(t) + IS_ERR(t) @@ expression t,e,e1; @@ t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\) ... when != t=e ?- t ? PTR_ERR(t) : e1 + PTR_ERR(t) ... when any // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/core: Issue DREQ when receiving REQ/REP for stale QPHans Westgaard Ry1-1/+23
from "InfiBand Architecture Specifications Volume 1": A QP is said to have a stale connection when only one side has connection information. A stale connection may result if the remote CM had dropped the connection and sent a DREQ but the DREQ was never received by the local CM. Alternatively the remote CM may have lost all record of past connections because its node crashed and rebooted, while the local CM did not become aware of the remote node's reboot and therefore did not clean up stale connections. and: A local CM may receive a REQ/REP for a stale connection. It shall abort the connection issuing REJ to the REQ/REP. It shall then issue DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP. This patch solves a problem with reuse of QPN. Current codebase, that is IPoIB, relies on a REAP-mechanism to do cleanup of the structures in CM. A problem with this is the timeconstants governing this mechanism; they are up to 768 seconds and the interface may look inresponsive in that period. Issuing a DREQ (and receiving a DREP) does the necessary cleanup and the interface comes up. Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/nes: use new api ethtool_{get|set}_link_ksettingsPhilippe Reynes1-34/+42
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/isert: do not ignore errors in dma_map_single()Alexey Khoroshilov1-0/+6
There are several places, where errors in dma_map_single() are ignored. The patch fixes them. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/rdmavt: Only put mmap_info ref if it existsJim Foraker1-1/+2
rvt_create_qp() creates qp->ip only when a qp creation request comes from userspace (udata is not NULL). If we exceed the number of available queue pairs however, the error path always attempts to put a kref to this structure. If the requestor is inside the kernel, this leads to a crash. We fix this by checking that qp->ip is not NULL before caling kref_put(). Signed-off-by: Jim Foraker <foraker1@llnl.gov> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/rdmavt: Handle the kthread worker using the new APIPetr Mladek1-23/+11
Use the new API to create and destroy the cq kthread worker. The API hides some implementation details. In particular, kthread_create_worker() allocates and initializes struct kthread_worker. It runs the kthread the right way and stores task_struct into the worker structure. In addition, the *on_cpu() variant binds the kthread to the given cpu and the related memory node. kthread_destroy_worker() flushes all pending works, stops the kthread and frees the structure. This patch does not change the existing behavior. Note that we must use the on_cpu() variant because the function starts the kthread and it must bind it to the right CPU before waking. The numa node is associated for given CPU as well. Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/rdmavt: Avoid queuing work into a destroyed cq kthread workerPetr Mladek1-14/+16
The memory barrier is not enough to protect queuing works into a destroyed cq kthread. Just imagine the following situation: CPU1 CPU2 rvt_cq_enter() worker = cq->rdi->worker; rvt_cq_exit() rdi->worker = NULL; smp_wmb(); kthread_flush_worker(worker); kthread_stop(worker->task); kfree(worker); // nothing queued yet => // nothing flushed and // happily stopped and freed if (likely(worker)) { // true => read before CPU2 acted cq->notify = RVT_CQ_NONE; cq->triggered++; kthread_queue_work(worker, &cq->comptask); BANG: worker has been flushed/stopped/freed in the meantime. This patch solves this by protecting the critical sections by rdi->n_cqs_lock. It seems that this lock is not much contended and looks reasonable for this purpose. One catch is that rvt_cq_enter() might be called from IRQ context. Therefore we must always take the lock with IRQs disabled to avoid a possible deadlock. Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/mlx5: avoid bogus -Wmaybe-uninitialized warningArnd Bergmann1-18/+21
We get a false-positive warning in linux-next for the mlx5 driver: infiniband/hw/mlx5/mr.c: In function ‘mlx5_ib_reg_user_mr’: infiniband/hw/mlx5/mr.c:1172:5: error: ‘order’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1161:6: note: ‘order’ was declared here infiniband/hw/mlx5/mr.c:1173:6: error: ‘ncont’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1160:6: note: ‘ncont’ was declared here infiniband/hw/mlx5/mr.c:1173:6: error: ‘page_shift’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1158:6: note: ‘page_shift’ was declared here infiniband/hw/mlx5/mr.c:1143:13: error: ‘npages’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1159:6: note: ‘npages’ was declared here I had a trivial workaround for gcc-5 or higher, but that didn't work on gcc-4.9 unfortunately. The only way I found to avoid the warnings for gcc-4.9, short of initializing each of the arguments first was to change the calling conventions to separate the error code from the umem pointer. This avoids casting the error codes from one pointer to another incompatible pointer, and lets gcc figure out when that the data is actually valid whenever we return successfully. Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14ib_isert: log the connection reject messageSteve Wise1-0/+2
Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14ib_iser: log the connection reject messageSteve Wise1-1/+4
Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14rdma_cm: add rdma_consumer_reject_data helper functionSteve Wise1-0/+16
rdma_consumer_reject_data() will return the private data pointer and length if any is available. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14rdma_cm: add rdma_is_consumer_reject() helper functionSteve Wise1-0/+13
Return true if the peer consumer application rejected the connection attempt. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14rdma_cm: add rdma_reject_msg() helper functionSteve Wise3-0/+83
rdma_reject_msg() returns a pointer to a string message associated with the transport reject reason codes. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14qedr: remove pointless NULL check in qedr_post_send()Wei Yongjun1-5/+0
Remove pointless NULL check for 'wr' in qedr_post_send(). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14qedr: Use list_move_tail instead of list_del/list_add_tailWei Yongjun1-2/+1
Using list_move_tail() instead of list_del() + list_add_tail(). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14qedr: Fix possible memory leak in qedr_create_qp()Wei Yongjun1-4/+8
'qp' is malloced in qedr_create_qp() and should be freed before leaving from the error handling cases, otherwise it will cause memory leak. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14qedr: return -EINVAL if pd is null and avoid null ptr dereferenceColin Ian King1-1/+3
Currently, if pd is null then we hit a null pointer derference on accessing pd->pd_id. Instead of just printing an error message we should also return -EINVAL immediately. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14IB/mad: Eliminate redundant SM class version defines for OPAHal Rosenstock2-8/+8
and rename class version define to indicate SM rather than SMP or SMI Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Properly adjust rate limit on QP state transitionsBodong Wang3-9/+69
- Add MODIFY_QP_EX CMD to extend modify_qp. - Rate limit will be updated in the following state transactions: RTR2RTS, RTS2RTS. The limit will be removed when SQ is in RST and ERR state. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/uverbs: Extend modify_qp and support packet pacingBodong Wang3-69/+125
An new uverbs command ib_uverbs_ex_modify_qp is added to support more QP attributes. User driver should choose to call the legacy/extended API based on input mask. IB_USER_LAST_QP_ATTR_MASK is added to indicated the maximum bit position which supports legacy ib_uverbs_modify_qp. IB_USER_LEGACY_LAST_QP_ATTR_MASK indicates the maximum bit position which supports ib_uverbs_ex_modify_qp, the value of this mask should be updated if new mask is added later. Along with this change, rate_limit is supported by the extended command, user driver could use it to control packet packing. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/core: Support rate limit for packet pacingBodong Wang1-0/+2
Add new member rate_limit to ib_qp_attr which holds the packet pacing rate in kbps, 0 means unlimited. IB_QP_RATE_LIMIT is added to ib_attr_mask and could be used by RAW QPs when changing QP state from RTR to RTS, RTS to RTS. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Report mlx5 packet pacing capabilities when querying deviceBodong Wang1-0/+13
Enable mlx5 based hardware to report packet pacing capabilities from kernel to user space. Packet pacing allows to limit the rate to any number between the maximum and minimum, based on user settings. The capabilities are exposed to user space through query_device by uhw. The following capabilities are reported: 1. The maximum and minimum rate limit in kbps supported by packet pacing. 2. Bitmap showing which QP types are supported by packet pacing operation. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Support RAW Ethernet when RoCE is disabledOr Gerlitz1-9/+13
On some environments, such as certain SRIOV VF configurations, RoCE is not supported for mlx5 Ethernet ports. Currently, the driver will not open IB device on that port. This is problematic, since we do want user-space RAW Ethernet (RAW_PACKET QPs) functionality to remain in place. For that end, enhance the relevant driver flows such that we do create a device instance in that case. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Rename RoCE related helpers to reflect being Eth onesOr Gerlitz1-11/+11
This is a pre-step towards having mlx5 IB device also over Eth ports where RoCE is not supported. We change the roce enable/disable and roce_lag init/fini function names to have _eth instead of _roce. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Refactor registration to netdev notifierOr Gerlitz1-9/+20
Refactor the netdev notifier registration into a small helper function. This is a pre-step towards having mlx5 IB device over an Ethernet port which doesn't support RoCE. Also, renamed the de-registration helper and the new helper as netdev notifier and not roce, to make it clear this is not only used with roce. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Use u64 for UMR lengthMaor Gottlieb1-1/+1
The fast_registration length is used to convey length for memory registrations through UMR which can be of any size up to 2^64. Change the length type to be u64. Fixes: 968e78dd9644 ('IB/mlx5: Enhance UMR support to allow partial page table update') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Avoid system crash when enabling many VFsEli Cohen1-1/+2
When enabling many VFs, the total amount of DMA mappings increase significantly. This causes DMA allocations to take a lot of time since they are serialized in the kernel. As a result the driver enters into fatal condition due to timeout and the system hangs. To recover from this we disable MR cache for VFs. PFs will still have a full cache and VFs cache can be manipulated as usual after driver load. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Assign SRQ type earlierMaor Gottlieb1-1/+1
Move the SRQ type assignment to be before actually using it in create_srq_user() and in create_srq_kernel() functions. Fixes: af1ba291c5e4 ('{net, IB}/mlx5: Refactor internal SRQ API') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx4: Fix out-of-range array index in destroy qp flowJack Morgenstein1-1/+2
For non-special QPs, the port value becomes non-zero only at the RESET-to-INIT transition. If the QP has not undergone that transition, its port number value is still zero. If such a QP is destroyed before being moved out of the RESET state, subtracting one from the qp port number results in a negative value. Using that negative value as an index into the qp1_proxy array results in an out-of-bounds array reference. Fix this by testing that the QP type is one that uses qp1_proxy before using the port number. For special QPs of all types, the port number is specified at QP creation time. Fixes: 9433c188915c ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Make create/destroy_ah available to userspaceMoni Shoua1-0/+2
Advertise that create_ah and destroy_ah verbs are accessible from uverbs interface. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Use kernel driver to help userspace create ahMoni Shoua1-0/+21
Resolving a MAC address for a given IP address in userspace is inefficient. This patch lets mlx5 user driver using the kernel driver to resolve the mac and get the answer in the private section of the response. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/core: Let create_ah return extended response to userMoni Shoua20-20/+56
Add struct ib_udata to the signature of create_ah callback that is implemented by IB device drivers. This allows HW drivers to return extra data to the userspace library. This patch prepares the ground for mlx5 driver to resolve destination mac address for a given GID and return it to userspace. This patch was previously submitted by Knut Omang as a part of the patch set to support Oracle's Infiniband HCA (SIF). Signed-off-by: Knut Omang <knut.omang@oracle.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Report that device has udata response in create_ahMoni Shoua1-1/+2
To make mlx5 user driver aware of whether kernel driver returns dmac in user data response add a new flag that will be returned back to user-space through alloc_ucontext. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/core: Change ib_resolve_eth_dmac to use it in create AHMoni Shoua3-48/+47
The function ib_resolve_eth_dmac() requires struct qp_attr * and qp_attr_mask as parameters while the function might be useful to resolve dmac for address handles. This patch changes the signature of the function so it can be used in the flow of creating an address handle. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Add support to match inner packet fieldsMoses Reuben1-53/+78
Add support to match packet fields which are tunneled, i.e. support matching the header of the inner packet which is the result of or bit operation of the original header and the IB_FLOW_SPEC_INNER type. The combination of IB_FLOW_SPEC_INNER | IB_FLOW_SPEC_VXLAN_TUNNEL is not needed to be checked, because the IB core has this check already. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/core: Introduce inner flow steeringMoses Reuben1-1/+3
For a tunneled packet which contains external and internal headers, we refer to the external headers as "outer fields" and the internal headers as "inner fields". Example of a tunneled packet: { L2 | L3 | L4 | tunnel header | L2 | L3 | l4 | data } | | | | | | | { outer fields }{ inner fields } This patch introduces a new flag for flow steering rules - IB_FLOW_SPEC_INNER - which specifies that the rule applies to the inner fields, rather than to the outer fields of the packet. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Support Vxlan tunneling specificationMoses Reuben1-0/+11
Add support to receive specific Vxlan packet in ConnectX-4. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/core: Add flow spec tunneling supportMoses Reuben1-0/+15
In order to support tunneling, that can be used by the QP, both struct ib_flow_spec_tunnel and struct ib_flow_tunnel_filter can be used to more IP or UDP based tunneling protocols (e.g NVGRE, GRE, etc). IB_FLOW_SPEC_VXLAN_TUNNEL type flow specification is added to use this functionality and match specific Vxlan packets. In similar to IPv6, we check overflow of the vni value by comparing with the maximum size. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Add support for CQE compressingBodong Wang1-1/+29
CQE compressing reduces PCI overhead by coalescing and compressing multiple CQEs into a single merged CQE. Successful compressing improves message rate especially for small packet traffic. CQE compressing is supported for all 64B CQE formats (with certain limitations) generated by RQ/Responder or by SQ/Requestor. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Report mlx5 CQE compression caps during queryBodong Wang1-0/+10
The capabilities include: - Max number of compressed and aggregated CQEs in a single session, while zero means unsupported. - For Responder, there are two formats of mini CQE: mini CQE with Rx hash and mini CQE with checksum. They're mutual exclusive. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13IB/mlx5: Report mlx5 multi packet WQE caps during queryBodong Wang1-0/+11
The capabilities whether hardware support multi packet WQE or not is exposed to user space through query_device by uhw. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>