summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2010-03-02Merge branch 'iser' into for-nextRoland Dreier5-604/+391
2010-03-02Merge branch 'ipoib' into for-nextRoland Dreier1-8/+2
2010-03-02Merge branch 'ehca' into for-nextRoland Dreier3-7/+4
2010-03-02Merge branch 'cxgb3' into for-nextRoland Dreier13-24/+203
2010-03-02Merge branch 'cma' into for-nextRoland Dreier1-1/+0
2010-02-24RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removingSteve Wise1-0/+1
If cxgb3 calls the iw_cxgb3 t3cclient remove function due to a device removal event, then the iwch device must be marked with CXIO_ERROR_FATAL since the device below us is going away. Otherwise, we can get stuck in a deadlock as RDMA ULPs try and deallocate objects (like MRs, QPs, etc). So always mark the device with CXIO_ERROR_FATAL when removing. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24RDMA/cxgb3: Don't allocate the SW queue for user mode CQsSteve Wise3-9/+9
Only kernel mode CQs need the SW queue memory allocated. The SW queue for user mode CQs is allocated in userspace by libcxgb3. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24RDMA/cxgb3: Increase the max CQ depthSteve Wise1-1/+1
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24RDMA/cxgb3: Doorbell overflow avoidance and recoverySteve Wise10-13/+192
T3 hardware doorbell FIFO overflows can cause application stalls due to lost doorbell ring events. This has been seen when running large NP IMB alltoall MPI jobs. The T3 hardware supports an xon/xoff-type flow control mechanism to help avoid overflowing the HW doorbell FIFO. This patch uses these interrupts to disable RDMA QP doorbell rings when we near an overflow condition, and then turn them back on (and ring all the active QP doorbells) when when the doorbell FIFO empties out. In addition if an doorbell ring is dropped by the hardware, the code will now recover. Design: cxgb3: - enable these DB interrupts - in the interrupt handler, schedule work tasks to call the ULPs event handlers with the new events. - ring all the qset txqs when an overflow is detected. iw_cxgb3: - disable db ringing on all active qps when we get the DB_FULL event - enable db ringing on all active qps and ring all active dbs when we get the DB_EMPTY event - On DB_DROP event: - disable db rings in the event handler - delay-schedule a work task which rings and enables the dbs on all active qps. - in post_send and post_recv logic, don't ring the db if it's disabled. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/core: Pack struct ib_device a little tighterAlexander Chiang1-2/+2
A small change to reduce the size of ib_device to 1112 bytes (from 1128). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/ucm: Clean whitespace errorsAlexander Chiang1-3/+3
As shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/ucm: Increase maximum devices supportedAlexander Chiang1-8/+45
Some large systems may support more than IB_UCM_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. the first IB_UCM_MAX_DEVICES keep the same major/minor device numbers they've always had. If there are more than IB_UCM_MAX_DEVICES, then we dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/ucm: Use stack variable 'base' in ib_ucm_add_oneAlexander Chiang1-1/+3
This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UCM_MAX_DEVICES. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/ucm: Use stack variable 'devnum' in ib_ucm_add_oneAlexander Chiang1-5/+7
This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UCM_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/umad: Clean whitespaceAlexander Chiang1-13/+13
Clean errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/umad: Increase maximum devices supportedAlexander Chiang1-6/+44
Some large systems may support more than IB_UMAD_MAX_PORTS (currently 64). This change allows us to support more ports in a backwards-compatible manner. The first IB_UMAD_MAX_PORTS keep the same major/minor device numbers they've always had. If there are more than IB_UMAD_MAX_PORTS, we then dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/umad: Use stack variable 'base' in ib_umad_init_portAlexander Chiang1-2/+5
This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UMAD_MAX_PORTS in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/umad: Use stack variable 'devnum' in ib_umad_init_portAlexander Chiang1-6/+9
This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UMAD_MAX_PORTS in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/umad: Remove port_table[]Alexander Chiang1-36/+9
We no longer need this data structure, as it was used to associate an inode back to a struct ib_umad_port during ->open(). But now that we're embedding a struct cdev in struct ib_umad_port, we can use the container_of() macro to go from the inode back to the device instead. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/umad: Convert *cdev to cdev in struct ib_umad_portAlexander Chiang1-26/+20
Instead of storing pointers to cdev and sm_cdev, embed the full structures instead. This change allows us to use the container_of() macro in ib_umad_open() and ib_umad_sm_open() in a future patch. This change increases the size of struct ib_umad_port to 320 bytes from 128. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/uverbs: Whitespace cleanupAlexander Chiang1-34/+34
Clean up the errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/uverbs: Pack struct ib_uverbs_event_file tighterAlexander Chiang1-2/+2
Eliminate some padding in the structure by rearranging the members. sizeof(struct ib_uverbs_event_file) is now 72 bytes (from 80) and more members now fit in the first cacheline. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/uverbs: Increase maximum devices supportedAlexander Chiang1-6/+50
Some large systems may support more than IB_UVERBS_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. The first IB_UVERBS_MAX_DEVICES keep the same major/minor device numbers that they've always had. If there are more than IB_UVERBS_MAX_DEVICES, we then dynamically request a new major device number (new minors start at 0). This change increases the maximum number of HCAs to 64 (from 32). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/uverbs: use stack variable 'base' in ib_uverbs_add_oneAlexander Chiang1-1/+3
This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UVERBS_MAX_DEVICES in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/uverbs: Use stack variable 'devnum' in ib_uverbs_add_oneAlexander Chiang1-5/+7
This change is not useful by itself, but it sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UVERBS_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/uverbs: Remove dev_tableAlexander Chiang1-19/+5
dev_table's raison d'etre was to associate an inode back to a struct ib_uverbs_device. However, now that we've converted ib_uverbs_device to contain an embedded cdev (instead of a *cdev), we can use the container_of() macro and cast back to the containing device. There's no longer any need for dev_table, so get rid of it. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/uverbs: Convert *cdev to cdev in struct ib_uverbs_deviceAlexander Chiang2-16/+14
Instead of storing a pointer to a cdev, embed the entire struct cdev. This change allows us to use the container_of() macro in ib_uverbs_open() in a future patch. This change increases the size of struct ib_uverbs_device to 168 bytes across 3 cachelines from 80 bytes in 2 cachelines. However, we rearrange the members so that everything fits into the first cacheline except for the struct cdev. Finally, we don't touch the cdev in any fastpaths, so this change shouldn't negatively affect performance. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Remove redundant locking from iser scsi command response flowOr Gerlitz1-25/+0
Currently the iSER receive completion flow takes the session lock twice. Optimize it to avoid the first one by letting iser_task_rdma_finalize() be called only from the cleanup_task callback invoked by iscsi_free_task, thus reducing the contention on the session lock between the scsi command submission to the scsi command completion flows. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Use libiscsi passthrough modeOr Gerlitz2-20/+3
libiscsi passthrough mode invokes the transport xmit calls directly without first going through an internal queue, unlike the other mode, which uses a queue and a xmitworker thread. Now that the "cant_sleep" prerequisite of iscsi_host_alloc is met, move to use it. Handling xmit errors is now done by the passthrough flow of libiscsi. Since the queue/worker aren't used in this mode, the code that schedules the xmitworker is removed. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Remove unnecessary connection checksOr Gerlitz3-52/+0
Remove unnecessary checks for the IB connection state and for QP overflow, as conn state changes are reported by iSER to libiscsi and handled there. QP overflow is theoretically possible only when unsolicited data-outs are used; anyway it's being checked and handled by HW drivers. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Use atomic allocationsOr Gerlitz2-3/+3
Two minor flows in iSER's data path still use allocations; move them to be atomic as a preperation step towards moving to use libiscsi passthrough mode. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Simplify send flow/descriptorsOr Gerlitz5-278/+115
Simplify and shrink the logic/code used for the send descriptors. Changes include removing struct iser_dto (an unnecessary abstraction), using struct iser_regd_buf only for handling SCSI commands, using dma_sync instead of dma_map/unmap, etc. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Use different CQ for send completionsOr Gerlitz2-37/+76
Use a different CQ for send completions, where send completions are polled by the interrupt-driven receive completion handler. Therefore, interrupts aren't used for the send CQ. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Remove atomic counter for posted receive buffersOr Gerlitz3-12/+12
Now that both the posting and reaping of receive buffers is done in the completion path, the counter of outstanding buffers not be atomic. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: New receive buffer posting logicOr Gerlitz4-176/+235
Currently, the recv buffer posting logic is based on the transactional nature of iSER which allows for posting a buffer before sending a PDU. Change this to post only when the number of outstanding recv buffers is below a water mark and in a batched manner, thus simplifying and optimizing the data path. Use a pre-allocated ring of recv buffers instead of allocating from kmem cache. A special treatment is given to the login response buffer whose size must be 8K unlike the size of buffers used for any other purpose which is 128 bytes. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24IB/iser: Revert commit bba7ebb "avoid recv buffer exhaustion"Or Gerlitz3-95/+41
We will make a major change in the recv buffer posting logic, after which the problem commit bba7ebb "avoid recv buffer exhaustion caused by unexpected PDUs" comes to solve doesn't exist any more, so revert it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19IB/ehca: Require in_wc in process_mad()Alexander Schmidt1-1/+1
If the caller does not pass a valid in_wc to process_mad(), return MAD failure status, as it is not possible to generate a valid MAD redirect response (and redirects are the only MAD responses ehca generates). Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-13IB/ehca: Allow access for ib_query_qp()Alexander Schmidt1-3/+1
The max_dest_rd_atomic and max_qp_rd_atomic values are properly returned by query_qp(), so there should not be an error returned when they are queried. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-13IB/ehca: Do not turn off irqs in tasklet contextAlexander Schmidt1-3/+2
The irq_spinlock is only taken in tasklet context, so it is safe not to disable hardware interrupts. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12IPoIB: Remove TX moderation settings from ethtool supportOr Gerlitz1-8/+2
As of commit f56bcd8 ("IPoIB: Use separate CQ for UD send completions"), there are no TX interrupts. Change the ethtool code not to report TX moderation settings, so users will not be misled to think they can control TX interrupt moderation. Pointed out by Alex Vainman <alexv@voltaire.com> Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12RDMA/cxgb3: Remove BUG_ON() on CQ rearm failureSteve Wise1-1/+0
Failure to rearm a CQ means the cxgb3 device is wedged, but we shouldn't kill the whole system with a BUG_ON() if this happens. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12RDMA/cm: Remove unused definition of RDMA_PS_SCTPSean Hefty1-1/+0
The defined SCTP number is incorrect (0x83, rather than 0x84), and since it is not used anywhere, simply remove the definition. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds1-7/+8
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Do not falsely trigger kerneloops
2010-02-12Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2-33/+19
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: cciss: Make cciss_seq_show handle holes in the h->drv[] array cfq-iosched: split seeky coop queues after one slice
2010-02-12Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds5-12/+10
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix the mapping of the NFSERR_SERVERFAULT error NFS: Remove a redundant check for PageFsCache in nfs_migrate_page() NFS: Fix a bug in nfs_fscache_release_page()
2010-02-12Merge git://git.infradead.org/users/cbou/battery-2.6.33Linus Torvalds1-2/+8
* git://git.infradead.org/users/cbou/battery-2.6.33: wm97xx_battery: Handle missing platform data gracefully
2010-02-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds6-44/+26
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] qla2xxx: Obtain proper host structure during response-queue processing. [SCSI] compat_ioct: fix bsg SG_IO [SCSI] qla2xxx: make msix interrupt handler safe for irq [SCSI] zfcp: Report FC BSG errors in correct field [SCSI] mptfusion : mptscsih_abort return value should be SUCCESS instead of value 0.
2010-02-12Merge branch 'for-linus' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: psmouse - make sure we don't schedule reconnects after cleanup
2010-02-12Merge branch 'drm-linus' of ↵Linus Torvalds43-230/+402
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (30 commits) vgaarb: fix incorrect dereference of userspace pointer. drm/radeon/kms: retry auxch on 0x20 timeout value. drm/radeon: Skip dma copy test in benchmark if card doesn't have dma engine. drm/vmwgfx: Fix a circular locking dependency bug. drm/vmwgfx: Drop scanout flag compat and add execbuf ioctl parameter members. Bumps major. drm/vmwgfx: Report propper framebuffer_{max|min}_{width|height} drm/vmwgfx: Update the user-space interface. drm/radeon/kms: fix screen clearing before fbcon. nouveau: fix state detection with switchable graphics drm/nouveau: move dereferences after null checks drm/nv50: make the pgraph irq handler loop like the pre-nv50 version drm/nv50: delete ramfc object after disabling fifo, not before drm/nv50: avoid unloading pgraph context when ctxprog is running drm/nv50: align size of buffer object to the right boundaries. drm/nv50: disregard dac outputs in nv50_sor_dpms() drm/nv50: prevent multiple init tables being parsed at the same time drm/nouveau: make dp auxch xfer len check for reads only drm/nv40: make INIT_COMPUTE_MEM a NOP, just like nv50 drm/nouveau: Add proper vgaarb support. drm/nouveau: Fix fbcon on mixed pre-NV50 + NV50 multicard. ...
2010-02-12Merge branch 'fixes' of ↵Linus Torvalds5-22/+10
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: drivers/dma: Correct NULL test async-tx: fix buffer submission error handling in ipu_idma.c dmaengine: correct onstack wait_queue_head declaration ioat: fix infinite timeout checking in ioat2_quiesce dmaengine: fix memleak in dma_async_device_unregister