summaryrefslogtreecommitdiff
path: root/include/linux/sunrpc/xprt.h
AgeCommit message (Collapse)AuthorFilesLines
2024-02-28SUNRPC: Add a transport callback to handle dequeuing of an RPC requestTrond Myklebust1-0/+1
Add a transport level callback to allow it to handle the consequences of dequeuing the request that was in the process of being transmitted. For something like a TCP connection, we may need to disconnect if the request was partially transmitted. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-01-05NFSv4.1: Use the nfs_client's rpc timeouts for backchannelBenjamin Coddington1-11/+0
For backchannel requests that lookup the appropriate nfs_client, use the state-management rpc_clnt's rpc_timeout parameters for the backchannel's response. When the nfs_client cannot be found, fall back to using the xprt's default timeout parameters. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-10-16SUNRPC: change the back-channel queue to lwqNeilBrown1-1/+2
This removes the need to store and update back-links in the list. It also remove the need for the _bh version of spin_lock(). Signed-off-by: NeilBrown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-24SUNRPC: Allow specification of TCP client connect timeout at setupTrond Myklebust1-0/+2
When we create a TCP transport, the connect timeout parameters are currently fixed to be 90s. This is problematic in the pNFS flexfiles case, where we may have multiple mirrors, and we would like to fail over quickly to the next mirror if a data server is down. This patch adds the ability to specify the connection parameters at RPC client creation time. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-06-19SUNRPC: Add a TCP-with-TLS RPC transport classChuck Lever1-0/+1
Use the new TLS handshake API to enable the SunRPC client code to request a TLS handshake. This implements support for RFC 9289, only on TCP sockets. Upper layers such as NFS use RPC-with-TLS to protect in-transit traffic. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19SUNRPC: Plumb an API for setting transport layer securityChuck Lever1-0/+17
Add an initial set of policies along with fields for upper layers to pass the requested policy down to the transport layer. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-27SUNRPC: Don't reuse bvec on retransmission of the requestTrond Myklebust1-1/+2
If a request is re-encoded and then retransmitted, we need to make sure that we also re-encode the bvec, in case the page lists have changed. Fixes: ff053dbbaffe ("SUNRPC: Move the call to xprt_send_pagedata() out of xprt_sock_sendmsg()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-25SUNRPC expose functions for offline remote xprt functionalityOlga Kornievskaia1-0/+3
Re-arrange the code that make offline transport and delete transport callable functions. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-04-08Merge tag 'nfs-for-5.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-3/+2
Pull NFS client fixes from Trond Myklebust: "Stable fixes: - SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() Bugfixes: - Fix an Oopsable condition due to SLAB_ACCOUNT setting in the NFSv4.2 xattr code. - Fix for open() using an file open mode of '3' in NFSv4 - Replace readdir's use of xxhash() with hash_64() - Several patches to handle malloc() failure in SUNRPC" * tag 'nfs-for-5.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Move the call to xprt_send_pagedata() out of xprt_sock_sendmsg() SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec() SUNRPC: Handle allocation failure in rpc_new_task() NFS: Ensure rpc_run_task() cannot fail in nfs_async_rename() NFSv4/pnfs: Handle RPC allocation errors in nfs4_proc_layoutget SUNRPC: Handle low memory situations in call_status() SUNRPC: Handle ENOMEM in call_transmit_status() NFSv4.2: Fix missing removal of SLAB_ACCOUNT on kmem_cache allocation SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() NFS: Replace readdir's use of xxhash() with hash_64() SUNRPC: handle malloc failure in ->request_prepare NFSv4: fix open failure with O_ACCMODE flag Revert "NFSv4: Handle the special Linux file open access mode"
2022-03-30SUNRPC: handle malloc failure in ->request_prepareNeilBrown1-3/+2
If ->request_prepare() detects an error, it sets ->rq_task->tk_status. This is easy for callers to ignore. The only caller is xprt_request_enqueue_receive() and it does ignore the error, as does call_encode() which calls it. This can result in a request being queued to receive a reply without an allocated receive buffer. So instead of setting rq_task->tk_status, return an error, and store in ->tk_status only in call_encode(); The call to xprt_request_enqueue_receive() is now earlier in call_encode(), where the error can still be handled. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-03-30Merge tag 'nfs-for-5.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-0/+3
Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - Switch NFS to use readahead instead of the obsolete readpages. - Readdir fixes to improve cacheability of large directories when there are multiple readers and writers. - Readdir performance improvements when doing a seekdir() immediately after opening the directory (common when re-exporting NFS). - NFS swap improvements from Neil Brown. - Loosen up memory allocation to permit direct reclaim and write back in cases where there is no danger of deadlocking the writeback code or NFS swap. - Avoid sillyrename when the NFSv4 server claims to support the necessary features to recover the unlinked but open file after reboot. Bugfixes: - Patch from Olga to add a mount option to control NFSv4.1 session trunking discovery, and default it to being off. - Fix a lockup in nfs_do_recoalesce(). - Two fixes for list iterator variables being used when pointing to the list head. - Fix a kernel memory scribble when reading from a non-socket transport in /sys/kernel/sunrpc. - Fix a race where reconnecting to a server could leave the TCP socket stuck forever in the connecting state. - Patch from Neil to fix a shutdown race which can leave the SUNRPC transport timer primed after we free the struct xprt itself. - Patch from Xin Xiong to fix reference count leaks in the NFSv4.2 copy offload. - Sunrpc patch from Olga to avoid resending a task on an offlined transport. Cleanups: - Patches from Dave Wysochanski to clean up the fscache code" * tag 'nfs-for-5.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (91 commits) NFSv4/pNFS: Fix another issue with a list iterator pointing to the head NFS: Don't loop forever in nfs_do_recoalesce() SUNRPC: Don't return error values in sysfs read of closed files SUNRPC: Do not dereference non-socket transports in sysfs NFSv4.1: don't retry BIND_CONN_TO_SESSION on session error SUNRPC don't resend a task on an offlined transport NFS: replace usage of found with dedicated list iterator variable SUNRPC: avoid race between mod_timer() and del_timer_sync() pNFS/files: Ensure pNFS allocation modes are consistent with nfsiod pNFS/flexfiles: Ensure pNFS allocation modes are consistent with nfsiod NFSv4/pnfs: Ensure pNFS allocation modes are consistent with nfsiod NFS: Avoid writeback threads getting stuck in mempool_alloc() NFS: nfsiod should not block forever in mempool_alloc() SUNRPC: Make the rpciod and xprtiod slab allocation modes consistent SUNRPC: Fix unx_lookup_cred() allocation NFS: Fix memory allocation in rpc_alloc_task() NFS: Fix memory allocation in rpc_malloc() SUNRPC: Improve accuracy of socket ENOBUFS determination SUNRPC: Replace internal use of SOCKWQ_ASYNC_NOSPACE SUNRPC: Fix socket waits for write buffer space ...
2022-03-25SUNRPC: Do not dereference non-socket transports in sysfsTrond Myklebust1-0/+3
Do not cast the struct xprt to a sock_xprt unless we know it is a UDP or TCP transport. Otherwise the call to lock the mutex will scribble over whatever structure is actually there. This has been seen to cause hard system lockups when the underlying transport was RDMA. Fixes: b49ea673e119 ("SUNRPC: lock against ->sock changing during sysfs read") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-01-28SUNRPC: add netns refcount tracker to struct rpc_xprtEric Dumazet1-0/+1
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-04Merge tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds1-0/+1
Pull NFS client updates from Anna Schumaker: "New Features: - Better client responsiveness when server isn't replying - Use refcount_t in sunrpc rpc_client refcount tracking - Add srcaddr and dst_port to the sunrpc sysfs info files - Add basic support for connection sharing between servers with multiple NICs` Bugfixes and Cleanups: - Sunrpc tracepoint cleanups - Disconnect after ib_post_send() errors to avoid deadlocks - Fix for tearing down rpcrdma_reps - Fix a potential pNFS layoutget livelock loop - pNFS layout barrier fixes - Fix a potential memory corruption in rpc_wake_up_queued_task_set_status() - Fix reconnection locking - Fix return value of get_srcport() - Remove rpcrdma_post_sends() - Remove pNFS dead code - Remove copy size restriction for inter-server copies - Overhaul the NFS callback service - Clean up sunrpc TCP socket shutdowns - Always provide aligned buffers to RPC read layers" * tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (39 commits) NFS: Always provide aligned buffers to the RPC read layers NFSv4.1 add network transport when session trunking is detected SUNRPC enforce creation of no more than max_connect xprts NFSv4 introduce max_connect mount options SUNRPC add xps_nunique_destaddr_xprts to xprt_switch_info in sysfs SUNRPC keep track of number of transports to unique addresses NFSv3: Delete duplicate judgement in nfs3_async_handle_jukebox SUNRPC: Tweak TCP socket shutdown in the RPC client SUNRPC: Simplify socket shutdown when not reusing TCP ports NFSv4.2: remove restriction of copy size for inter-server copy. NFS: Clean up the synopsis of callback process_op() NFS: Extract the xdr_init_encode/decode() calls from decode_compound NFS: Remove unused callback void decoder NFS: Add a private local dispatcher for NFSv4 callback operations SUNRPC: Eliminate the RQ_AUTHERR flag SUNRPC: Set rq_auth_stat in the pg_authenticate() callout SUNRPC: Add svc_rqst::rq_auth_stat SUNRPC: Add dst_port to the sysfs xprt info file SUNRPC: Add srcaddr as a file in sysfs sunrpc: Fix return value of get_srcport() ...
2021-08-20SUNRPC: Move client-side disconnect injectionChuck Lever1-18/+0
Disconnect injection stress-tests the ability for both client and server implementations to behave resiliently in the face of network instability. Convert the existing client-side disconnect injection infrastructure to use the kernel's generic error injection facility. The generic facility has a richer set of injection criteria. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-09SUNRPC: Fix potential memory corruptionTrond Myklebust1-0/+1
We really should not call rpc_wake_up_queued_task_set_status() with xprt->snd_task as an argument unless we are certain that is actually an rpc_task. Fixes: 0445f92c5d53 ("SUNRPC: Fix disconnection races") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-07-08sunrpc: remove an offlined xprt using sysfsOlga Kornievskaia1-0/+1
Once a transport has been put offline, this transport can be also removed from the list of transports. Any tasks that have been stuck on this transport would find the next available active transport and be re-tried. This transport would be removed from the xprt_switch list and freed. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08SUNRPC: take a xprt offline using sysfsOlga Kornievskaia1-0/+1
Using sysfs's xprt_state attribute, mark a particular transport offline. It will not be picked during the round-robin selection. It's not allowed to take the main (1st created transport associated with the rpc_client) offline. Also bring a transport back online via sysfs by writing "online" and that would allow for this transport to be picked during the round- robin selection. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08SUNRPC mark the first transportOlga Kornievskaia1-0/+1
When an RPC client gets created it's first transport is special and should be marked a main transport. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add dst_attr attributes to the sysfs xprt directoryOlga Kornievskaia1-0/+1
Allow to query and set the destination's address of a transport. Setting of the destination address is allowed only for TCP or RDMA based connections. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add add sysfs directory per xprt under each xprt_switchOlga Kornievskaia1-0/+2
Add individual transport directories under each transport switch group. For instance, for each nconnect=X connections there will be a transport directory. Naming conventions also identifies transport type -- xprt-<id>-<type> where type is udp, tcp, rdma, local, bc. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: keep track of the xprt_class in rpc_xprt structureOlga Kornievskaia1-0/+2
We need to keep track of the type for a given transport. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add xprt idOlga Kornievskaia1-0/+2
This adds a unique identifier for a sunrpc transport in sysfs, which is similarly managed to the unique IDs of clients. Signed-off-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-05-26SUNRPC: More fixes for backlog congestionTrond Myklebust1-0/+2
Ensure that we fix the XPRT_CONGESTED starvation issue for RDMA as well as socket based transports. Ensure we always initialise the request after waking up from the backlog list. Fixes: e877a88d1f06 ("SUNRPC in case of backlog, hand free slots directly to waiting task") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-05SUNRPC: Set TCP_CORK until the transmit queue is emptyTrond Myklebust1-0/+1
When we have multiple RPC requests queued up, it makes sense to set the TCP_CORK option while the transmit queue is non-empty. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02SUNRPC: Remove unused function xprt_load_transport()Trond Myklebust1-1/+0
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02SUNRPC: Add a helper to return the transport identifier given a netidTrond Myklebust1-0/+1
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02SUNRPC: xprt_load_transport() needs to support the netid "rdma6"Trond Myklebust1-0/+1
According to RFC5666, the correct netid for an IPv6 addressed RDMA transport is "rdma6", which we've supported as a mount option since Linux-4.7. The problem is when we try to load the module "xprtrdma6", that will fail, since there is no modulealias of that name. Fixes: 181342c5ebe8 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-08-05SUNRPC dont update timeout value on connection resetOlga Kornievskaia1-0/+1
Current behaviour: every time a v3 operation is re-sent to the server we update (double) the timeout. There is no distinction between whether or not the previous timer had expired before the re-sent happened. Here's the scenario: 1. Client sends a v3 operation 2. Server RST-s the connection (prior to the timeout) (eg., connection is immediately reset) 3. Client re-sends a v3 operation but the timeout is now 120sec. As a result, an application sees 2mins pause before a retry in case server again does not reply. Instead, this patch proposes to keep track off when the minor timeout should happen and if it didn't, then don't update the new timeout. Value is updated based on the previous value to make timeouts predictable. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-12-08Merge tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-4/+0
Pull nfsd updates from Bruce Fields: "This is a relatively quiet cycle for nfsd, mainly various bugfixes. Possibly most interesting is Trond's fixes for some callback races that were due to my incomplete understanding of rpc client shutdown. Unfortunately at the last minute I've started noticing a new intermittent failure to send callbacks. As the logic seems basically correct, I'm leaving Trond's patches in for now, and hope to find a fix in the next week so I don't have to revert those patches" * tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux: (24 commits) nfsd: depend on CRYPTO_MD5 for legacy client tracking NFSD fixing possible null pointer derefering in copy offload nfsd: check for EBUSY from vfs_rmdir/vfs_unink. nfsd: Ensure CLONE persists data and metadata changes to the target file SUNRPC: Fix backchannel latency metrics nfsd: restore NFSv3 ACL support nfsd: v4 support requires CRYPTO_SHA256 nfsd: Fix cld_net->cn_tfm initialization lockd: remove __KERNEL__ ifdefs sunrpc: remove __KERNEL__ ifdefs race in exportfs_decode_fh() nfsd: Drop LIST_HEAD where the variable it declares is never used. nfsd: document callback_wq serialization of callback code nfsd: mark cb path down on unknown errors nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback() nfsd: minor 4.1 callback cleanup SUNRPC: Fix svcauth_gss_proxy_init() SUNRPC: Trace gssproxy upcall results sunrpc: fix crash when cache_head become valid before update nfsd: remove private bin2hex implementation ...
2019-11-12sunrpc: remove __KERNEL__ ifdefsChristoph Hellwig1-4/+0
Remove the __KERNEL__ ifdefs from the non-UAPI sunrpc headers, as those can't be included from user space programs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-04NFSv4.1: Don't rebind to the same source port when reconnecting to the serverTrond Myklebust1-1/+2
NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look at the source port when deciding whether or not an RPC call is a replay of a previous call. This requires clients to perform strange TCP gymnastics in order to ensure that when they reconnect to the server, they bind to the same source port. NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics, that do not look at the source port of the connection. This patch therefore ensures they can ignore the rebind requirement. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-09-17SUNRPC: Dequeue the request from the receive queue while we're re-encodingTrond Myklebust1-0/+1
Ensure that we dequeue the request from the transport receive queue while we're re-encoding to prevent issues like use-after-free when we release the bvec. Fixes: 7536908982047 ("SUNRPC: Ensure the bvecs are reset when we re-encode...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-07-18SUNRPC: Fix up backchannel slot table accountingTrond Myklebust1-2/+4
Add a per-transport maximum limit in the socket case, and add helpers to allow the NFSv4 code to discover that limit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-12Merge tag 'nfs-rdma-for-5.3-1' of ↵Trond Myklebust1-0/+3
git://git.linux-nfs.org/projects/anna/linux-nfs NFSoRDMA client updates for 5.3 New features: - Add a way to place MRs back on the free list - Reduce context switching - Add new trace events Bugfixes and cleanups: - Fix a BUG when tracing is enabled with NFSv4.1 - Fix a use-after-free in rpcrdma_post_recvs - Replace use of xdr_stream_pos in rpcrdma_marshal_req - Fix occasional transport deadlock - Fix show_nfs_errors macros, other tracing improvements - Remove RPCRDMA_REQ_F_PENDING and fr_state - Various simplifications and refactors
2019-07-09xprtrdma: Modernize ops->connectChuck Lever1-0/+3
Adapt and apply changes that were made to the TCP socket connect code. See the following commits for details on the purpose of these changes: Commit 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly") Commit 3851f1cdb2b8 ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout") Commit 02910177aede ("SUNRPC: Fix reconnection timeouts") Some common transport code is moved to xprt.c to satisfy the code duplication police. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-07-06SUNRPC: Add basic load balancing to the transport switchTrond Myklebust1-0/+1
For now, just count the queue length. It is less accurate than counting number of bytes queued, but easier to implement. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2019-04-25SUNRPC: Refactor xprt_request_wait_receive()Trond Myklebust1-3/+3
Convert the transport callback to actually put the request to sleep instead of just setting a timeout. This is in preparation for rpc_sleep_on_timeout(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-02-13SUNRPC: Remove rpc_xprt::tsh_sizeChuck Lever1-7/+0
tsh_size was added to accommodate transports that send a pre-amble before each RPC message. However, this assumes the pre-amble is fixed in size, which isn't true for some transports. That makes tsh_size not very generic. Also I'd like to make the estimation of RPC send and receive buffer sizes more precise. tsh_size doesn't currently appear to be accounted for at all by call_allocate. Therefore let's just remove the tsh_size concept, and make the only transports that have a non-zero tsh_size employ a direct approach. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-28sunrpc: remove unused bc_up operation from rpc_xprt_opsVasily Averin1-1/+0
Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-09-30SUNRPC: Add a bvec array to struct xdr_buf for use with iovec_iter()Trond Myklebust1-0/+2
Add a bvec array to struct xdr_buf, and have the client allocate it when we need to receive data into pages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Convert xprt receive queue to use an rbtreeTrond Myklebust1-2/+2
If the server is slow, we can find ourselves with quite a lot of entries on the receive queue. Converting the search from an O(n) to O(log(n)) can make a significant difference, particularly since we have to hold a number of locks while searching. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Cleanup: remove the unused 'task' argument from the request_send()Trond Myklebust1-1/+1
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Clean up transport write space handlingTrond Myklebust1-2/+3
Treat socket write space handling in the same way we now treat transport congestion: by denying the XPRT_LOCK until the transport signals that it has free buffer space. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Turn off throttling of RPC slots for TCP socketsTrond Myklebust1-1/+0
The theory was that we would need to grab the socket lock anyway, so we might as well use it to gate the allocation of RPC slots for a TCP socket. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Support for congestion control when queuing is enabledTrond Myklebust1-0/+2
Both RDMA and UDP transports require the request to get a "congestion control" credit before they can be transmitted. Right now, this is done when the request locks the socket. We'd like it to happen when a request attempts to be transmitted for the first time. In order to support retransmission of requests that already hold such credits, we also want to ensure that they get queued first, so that we don't deadlock with requests that have yet to obtain a credit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Improve latency for interactive tasksTrond Myklebust1-0/+1
One of the intentions with the priority queues was to ensure that no single process can hog the transport. The field task->tk_owner therefore identifies the RPC call's origin, and is intended to allow the RPC layer to organise queues for fairness. This commit therefore modifies the transmit queue to group requests by task->tk_owner, and ensures that we round robin among those groups. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Treat the task and request as separate in the xprt_ops->send_request()Trond Myklebust1-1/+1
When we shift to using the transmit queue, then the task that holds the write lock will not necessarily be the same as the one being transmitted. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Refactor RPC call encodingTrond Myklebust1-0/+1
Move the call encoding so that it occurs before the transport connection etc. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Add a transmission queue for RPC requestsTrond Myklebust1-0/+6
Add the queue that will enforce the ordering of RPC task transmission. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>