path: root/fs/cifs/file.c
Age | Commit message | Author | Files | Lines
2019-01-30 | CIFS: Fix possible oops and memory leaks in async IO | Pavel Shilovsky | 1 | -3/+8
Allocation of a page array for non-cached IO was separated from allocation of rdata and wdata structures and this introduced memory leaks and a possible null pointer dereference. This patch fixes these problems. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11 | CIFS: Fix error paths in writeback code | Pavel Shilovsky | 1 | -6/+23
This patch aims to address writeback code problems related to error paths. In particular, it respects EINTR and related error codes, and it stores and returns the first error that occurred during writeback. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
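The "store and return the first error" behaviour described in this commit message can be illustrated with a small userspace sketch. This is not the kernel code: `writeback_result()` and `is_interrupt_error()` are hypothetical names, and treating only `-EINTR` as an interrupt-style error is an assumption, since the log does not list the "related error codes".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: true for errors that mean "stop now".
 * Assumption: only -EINTR is checked here; the real patch covers
 * further related codes that this log does not enumerate. */
static int is_interrupt_error(int rc)
{
    return rc == -EINTR;
}

/* Walk the per-chunk results of one writeback pass and return the
 * first error seen, stopping early on an interrupt-style error. */
static int writeback_result(const int *chunk_rc, size_t n)
{
    int saved_rc = 0;

    for (size_t i = 0; i < n; i++) {
        int rc = chunk_rc[i];

        if (rc == 0)
            continue;
        if (saved_rc == 0)
            saved_rc = rc;      /* remember only the first failure */
        if (is_interrupt_error(rc))
            break;              /* do not keep going after EINTR */
    }
    return saved_rc;
}
```

With results `{0, -EIO, -ENOSPC}` the sketch reports `-EIO`: later failures do not overwrite the first one.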
2019-01-11 | cifs: Fix potential OOB access of lock element array | Ross Lagerwall | 1 | -4/+4
If maxBuf is small but non-zero, it could result in a zero sized lock element array which we would then try and access OOB. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-01-11 | cifs: Limit memory used by lock request calls to a page | Ross Lagerwall | 1 | -0/+8
The code tries to allocate a contiguous buffer with a size supplied by the server (maxBuf). This could fail if memory is fragmented since it results in high order allocations for commonly used server implementations. It is also wasteful since there are probably few locks in the usual case. Limit the buffer to be no larger than a page to avoid memory allocation failures due to fragmentation. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Steve French <stfrench@microsoft.com>
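The two lock-buffer fixes above (reject a maxBuf too small to hold even one element, and cap the buffer at one page) can be sketched together as follows. All constants and the function name are hypothetical; the real header and element sizes come from the SMB structures, which this log does not show.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_HDR_SIZE  32u    /* hypothetical request header size */
#define SKETCH_ELEM_SIZE 20u    /* hypothetical size of one lock element */

/* Return how many lock elements fit in one request buffer, or 0 when
 * max_buf cannot hold the header plus at least one element. */
static size_t lock_elems_per_request(size_t max_buf)
{
    /* A small but non-zero max_buf must not yield a zero-sized array
     * that is later indexed. */
    if (max_buf < SKETCH_HDR_SIZE + SKETCH_ELEM_SIZE)
        return 0;

    /* Cap at one page so the allocation never needs a high order. */
    if (max_buf > SKETCH_PAGE_SIZE)
        max_buf = SKETCH_PAGE_SIZE;

    return (max_buf - SKETCH_HDR_SIZE) / SKETCH_ELEM_SIZE;
}
```

A caller would treat a return value of 0 as an error instead of allocating and indexing an empty array.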
2019-01-05 | fs: don't open code lru_to_page() | Nikolay Borisov | 1 | -1/+2
Multiple filesystems open code lru_to_page(). Rectify this by moving the macro from mm_inline (which is specific to lru stuff) to the more generic mm.h header and start using the macro where appropriate. No functional changes. Link: http://lkml.kernel.org/r/20181129104810.23361-1-nborisov@suse.com Link: https://lkml.kernel.org/r/20181129075301.29087-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Pankaj gupta <pagupta@redhat.com> Acked-by: "Yan, Zheng" <zyan@redhat.com> [ceph] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-02 | Merge tag '4.21-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds | 1 | -2/+10
Pull cifs updates from Steve French: - four fixes for stable - improvements to DFS including allowing failover to alternate targets - some small performance improvements * tag '4.21-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (39 commits) cifs: update internal module version number cifs: we can not use small padding iovs together with encryption cifs: Minor Kconfig clarification cifs: Always resolve hostname before reconnecting cifs: Add support for failover in cifs_reconnect_tcon() cifs: Add support for failover in smb2_reconnect() cifs: Only free DFS target list if we actually got one cifs: start DFS cache refresher in cifs_mount() cifs: Use GFP_ATOMIC when a lock is held in cifs_mount() cifs: Add support for failover in cifs_reconnect() cifs: Add support for failover in cifs_mount() cifs: remove set but not used variable 'sep' cifs: Make use of DFS cache to get new DFS referrals cifs: minor updates to documentation cifs: check kzalloc return cifs: remove set but not used variable 'server' cifs: Use kzfree() to free password cifs: Fix to use kmem_cache_free() instead of kfree() cifs: update for current_kernel_time64() removal cifs: Add DFS cache routines ...
2018-12-28 | Merge tag 'locks-v4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux | Linus Torvalds | 1 | -2/+2
Pull file locking updates from Jeff Layton: "The main change in this set is Neil Brown's work to reduce the thundering herd problem when a heavily-contended file lock is released. Previously we'd always wake up all waiters when this occurred. With this set, we'll now only wake up waiters that were blocked on the range being released" * tag 'locks-v4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: locks: Use inode_is_open_for_write fs/locks: remove unnecessary white space. fs/locks: merge posix_unblock_lock() and locks_delete_block() fs/locks: create a tree of dependent requests. fs/locks: change all *_conflict() functions to return bool. fs/locks: always delete_block after waiting. fs/locks: allow a lock request to block other requests. fs/locks: use properly initialized file_lock when unlocking. ocfs2: properly initial file_lock used for unlock. gfs2: properly initial file_lock used for unlock. NFS: use locks_copy_lock() to copy locks. fs/locks: split out __locks_wake_up_blocks(). fs/locks: rename some lists and pointers.
2018-12-24 | CIFS: return correct errors when pinning memory failed for direct I/O | Long Li | 1 | -1/+7
When pinning memory failed, we should return the correct error code and rewind the SMB credits. Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Cc: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-24 | CIFS: use the correct length when pinning memory for direct I/O for write | Long Li | 1 | -1/+3
The current code attempts to pin memory using the largest possible wsize based on the current SMB credits. This doesn't cause a kernel oops, but it is not optimal because we may pin more pages than actually needed. Fix this by pinning only what is needed for this write I/O. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
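As a rough illustration of the idea in this fix, the length to pin can be taken as the smaller of the bytes left in the request and the credit-derived wsize, rather than wsize alone. The helper names and the 4096-byte page size below are assumptions made for the sketch, not kernel identifiers.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* Pin no more than what this write still has to send. */
static size_t pin_len(size_t remaining, size_t wsize)
{
    return remaining < wsize ? remaining : wsize;
}

/* Number of pages touched by len bytes starting at offset_in_page. */
static size_t pages_spanned(size_t offset_in_page, size_t len)
{
    return (offset_in_page + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For a 100-byte write with a 64 KiB wsize, pinning by wsize would cover 16 pages, while pinning by `pin_len()` covers a single page.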
2018-12-07 | fs/locks: merge posix_unblock_lock() and locks_delete_block() | NeilBrown | 1 | -1/+1
posix_unblock_lock() is not specific to posix locks, and behaves nearly identically to locks_delete_block() - the former returning a status while the latter doesn't. So discard posix_unblock_lock() and use locks_delete_block() instead, after giving that function an appropriate return value. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2018-12-07 | CIFS: Avoid returning EBUSY to upper layer VFS | Long Li | 1 | -25/+6
EBUSY is not handled by VFS, and will be passed to user-mode. This is not correct as we need to wait for more credits. This patch also fixes a bug where rsize or wsize is used uninitialized when the call to server->ops->wait_mtu_credits() fails. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-11-30 | fs/locks: rename some lists and pointers. | NeilBrown | 1 | -1/+1
struct file_lock contains an 'fl_next' pointer which is used to point to the lock that this request is blocked waiting for. So rename it to fl_blocker. The fl_blocked list_head in an active lock is the head of a list of blocked requests. In a request it is a node in that list. These are two distinct uses, so replace with two list_heads with different names. fl_blocked_requests is the head of a list of blocked requests; fl_blocked_member is a node in that list. The two different list_heads are never used at the same time, but that will change in a future patch. Note that a tracepoint is changed to report fl_blocker instead of fl_next. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2018-11-02 | cifs: fix signed/unsigned mismatch on aio_read patch | Steve French | 1 | -6/+11
The patch "CIFS: Add support for direct I/O read" had a signed/unsigned mismatch (ssize_t vs. size_t) in the return from one function. Similar trivial change in aio_write Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr>
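Why an ssize_t vs. size_t mismatch matters can be shown with a short sketch: a negative error code stored in an unsigned type turns into a very large byte count and can no longer be detected with a `< 0` check. The functions below are hypothetical stand-ins, not the CIFS read path.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical read helper: bytes read, or a negative errno. */
static ssize_t sketch_read(int fail)
{
    return fail ? -EIO : 128;
}

/* Mismatched type: the negative errno becomes a huge unsigned count. */
static size_t result_as_size_t(int fail)
{
    return (size_t)sketch_read(fail);
}

/* Matching type: the sign survives, so the error can be detected. */
static int read_failed(int fail)
{
    ssize_t rc = sketch_read(fail);

    return rc < 0;
}
```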
2018-11-02 | CIFS: Add support for direct I/O write | Long Li | 1 | -41/+163
With direct I/O write, user supplied buffers are pinned to the memory and data are transferred directly from user buffers to the transport layer. Change in v3: add support for kernel AIO Change in v4: Refactor common write code to __cifs_writev for direct and non-direct I/O. Retry on direct I/O failure. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-11-02 | CIFS: Add support for direct I/O read | Long Li | 1 | -39/+186
With direct I/O read, we transfer the data directly from transport layer to the user data buffer. Change in v3: add support for kernel AIO Change in v4: Refactor common read code to __cifs_readv for direct and non-direct I/O. Retry on direct I/O failure. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-11-02 | cifs: fix spelling mistake, EACCESS -> EACCES | Colin Ian King | 1 | -1/+1
Trivial fix to a spelling mistake of the error access name EACCESS, rename to EACCES Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-11-02 | Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 1 | -2/+2
Pull AFS updates from Al Viro: "AFS series, with some iov_iter bits included" * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) missing bits of "iov_iter: Separate type from direction and use accessor functions" afs: Probe multiple fileservers simultaneously afs: Fix callback handling afs: Eliminate the address pointer from the address list cursor afs: Allow dumping of server cursor on operation failure afs: Implement YFS support in the fs client afs: Expand data structure fields to support YFS afs: Get the target vnode in afs_rmdir() and get a callback on it afs: Calc callback expiry in op reply delivery afs: Fix FS.FetchStatus delivery from updating wrong vnode afs: Implement the YFS cache manager service afs: Remove callback details from afs_callback_break struct afs: Commit the status on a new file/dir/symlink afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS afs: Don't invoke the server to read data beyond EOF afs: Add a couple of tracepoints to log I/O errors afs: Handle EIO from delivery function afs: Fix TTL on VL server and address lists afs: Implement VL server rotation afs: Improve FS server rotation error handling ...
2018-10-24 | smb3: show number of current open files in /proc/fs/cifs/Stats | Steve French | 1 | -0/+2
To allow better debugging (for example applications with handle leaks, or complex reconnect scenarios) display the number of open files (on the client) and number of open server file handles for each tcon in /proc/fs/cifs/Stats. Note that open files on server is one larger than local due to handle caching (in this case of the root of the share). In this example there are two local open files, and three (two file and one directory handle) open on the server. Sample output:
    $ cat /proc/fs/cifs/Stats
    Resources in use
    CIFS Session: 1
    Share (unique mount targets): 2
    SMB Request/Response Buffer: 1 Pool size: 5
    SMB Small Req/Resp Buffer: 1 Pool size: 30
    Operations (MIDs): 0
    0 session 0 share reconnects
    Total vfs operations: 36 maximum at one time: 2
    1) \\localhost\test
    SMBs: 69
    Bytes read: 27 Bytes written: 0
    Open files: 2 total (local), 3 open on server
    TreeConnects: 1 total 0 failed
    TreeDisconnects: 0 total 0 failed
    Creates: 19 total 0 failed
    Closes: 16 total 0 failed
    ...
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-10-24 | cifs: track writepages in vfs operation counters | Steve French | 1 | -1/+10
writepages and readpages operations did not call get/free_xid so the statistics for file copy could get confusing with "vfs operations" not increasing. Add get_xid and free_xid to cifs readpages and writepages functions. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-10-24 | cifs: OFD locks do not conflict with eachothers | Ronnie Sahlberg | 1 | -13/+22
RHBZ 1484130 Update cifs_find_fid_lock_conflict() to recognize that OFD locks do not conflict with each other. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-10-24 | cifs: do not return atime less than mtime | Steve French | 1 | -2/+6
In a network file system it is fairly easy for server and client atime vs. mtime to get confused (and for atime to be updated less frequently), which we noticed broke some apps that expect atime >= mtime. Also ignore the relatime mount option (rather than returning an error on it), since relatime is basically what some network server file systems are already doing. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
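The rule "never report an atime earlier than mtime" amounts to a clamp. A minimal sketch, assuming a simplified timestamp type (`struct sketch_time` is made up for illustration and is not the kernel's timestamp structure):

```c
/* Simplified timestamp used only for this sketch. */
struct sketch_time {
    long long sec;
    long nsec;
};

/* Return 1 when a is strictly earlier than b. */
static int time_before(struct sketch_time a, struct sketch_time b)
{
    if (a.sec != b.sec)
        return a.sec < b.sec;
    return a.nsec < b.nsec;
}

/* Report mtime in place of atime whenever atime would be older. */
static struct sketch_time clamp_atime(struct sketch_time atime,
                                      struct sketch_time mtime)
{
    return time_before(atime, mtime) ? mtime : atime;
}
```

Applications that assume atime >= mtime then see a consistent pair even when the server updates atime lazily.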
2018-10-24 | iov_iter: Use accessor function | David Howells | 1 | -2/+2
Use accessor functions to access an iterator's type and direction. This allows for the possibility of using some other method of determining the type of iterator than if-chains with bitwise-AND conditions. Signed-off-by: David Howells <dhowells@redhat.com>
2018-06-13 | treewide: kzalloc() -> kcalloc() | Kees Cook | 1 | -1/+1
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a, b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kcalloc(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) 
| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) 
| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) 
| - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
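The motivation for the two-argument form is that the allocator can check the multiplication for overflow instead of receiving an already-wrapped product. A userspace analogue is sketched below; `zalloc_array()` is a hypothetical name, and `calloc()` stands in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Userspace analogue of a two-factor zeroing allocator: refuse the
 * request when n * size cannot be represented in size_t, rather than
 * silently allocating a short buffer. */
static void *zalloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;            /* n * size would overflow */
    return calloc(n, size);     /* zero-filled, like the kernel form */
}
```

With the open-coded `a * b` form, a count of `SIZE_MAX` elements of 2 bytes would wrap to a tiny size; the two-factor form fails cleanly instead.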
2018-06-03 | CIFS: Use offset when reading pages | Long Li | 1 | -15/+37
With offset defined in rdata, transport functions need to look at this offset when reading data into the correct places in pages. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-03 | CIFS: Add support for direct pages in rdata | Long Li | 1 | -3/+20
Add a function to allocate rdata without allocating pages for data transfer. This gives the caller an option to pass a number of pages that point to the data buffer. rdata is still responsible for freeing those pages after it's done. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-04-17 | fs: cifs: Adding new return type vm_fault_t | Souptick Joarder | 1 | -1/+1
Use new return type vm_fault_t for page_mkwrite handler. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-04-11 | page cache: use xa_lock | Matthew Wilcox | 1 | -5/+4
Remove the address_space ->tree_lock and use the xa_lock newly added to the radix_tree_root. Rename the address_space ->page_tree to ->i_pages, since we don't really care that it's a tree. [willy@infradead.org: fix nds32, fs/dax.c] Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.org Link: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Jeff Layton <jlayton@redhat.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-25 | CIFS: SMBD: Upper layer performs SMB read via RDMA write through memory registration | Long Li | 1 | -2/+15
If I/O size is larger than rdma_readwrite_threshold, use RDMA write for SMB read by specifying channel SMB2_CHANNEL_RDMA_V1 or SMB2_CHANNEL_RDMA_V1_INVALIDATE in the SMB packet, depending on SMB dialect used. Append a smbd_buffer_descriptor_v1 to the end of the SMB packet and fill in other values to indicate this SMB read uses RDMA write. There is no need to read from the transport for incoming payload. At the time SMB read response comes back, the data is already transferred and placed in the pages by RDMA hardware. When SMB read is finished, deregister the memory regions if RDMA write is used for this SMB read. smbd_deregister_mr may need to do local invalidation and sleep, if server remote invalidation is not used. There are situations where the MID may not be created on I/O failure, under which memory region is deregistered when read data context is released. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-01-25 | cifs: Fix missing put_xid in cifs_file_strict_mmap | Matthew Wilcox | 1 | -14/+12
If cifs_zap_mapping() returned an error, we would return without putting the xid that we got earlier. Restructure cifs_file_strict_mmap() and cifs_file_mmap() to be more similar to each other and have a single point of return that always puts the xid. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org>
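The "single point of return that always puts the xid" structure is the usual goto-out pattern. The sketch below imitates it with a counter in place of the real xid bookkeeping; every name in it is hypothetical, and the two integer parameters stand in for the results of the zap and mmap steps.

```c
/* Counts identifiers that were taken but not yet released. */
static int outstanding_xids;

static void sketch_get_xid(void) { outstanding_xids++; }
static void sketch_put_xid(void) { outstanding_xids--; }

/* zap_rc and map_rc stand in for the results of the two steps the
 * commit message mentions; 0 means success. */
static int sketch_strict_mmap(int zap_rc, int map_rc)
{
    int rc;

    sketch_get_xid();

    rc = zap_rc;            /* stands in for cifs_zap_mapping() */
    if (rc)
        goto out;           /* an early "return rc" here would leak */

    rc = map_rc;            /* stands in for the generic mmap step */
out:
    sketch_put_xid();       /* single exit: always released */
    return rc;
}
```

Whatever the outcome of either step, `outstanding_xids` returns to zero after the call.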
2017-11-16 | cifs: use find_get_pages_range_tag() | Jan Kara | 1 | -19/+2
wdata_alloc_and_fillpages() needlessly iterates calls to find_get_pages_tag(). Also it wants only pages from given range. Make it use find_get_pages_range_tag(). Link: http://lkml.kernel.org/r/20171009151359.31984-17-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Suggested-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-23 | SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags | Steve French | 1 | -0/+7
Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2017-09-21 | CIFS: make arrays static const, reduces object code size | Colin Ian King | 1 | -4/+8
Don't populate the read-only arrays types[] on the stack, instead make them both static const. Makes the object code smaller by over 200 bytes:
Before:
   text    data     bss     dec     hex filename
 111503   37696     448  149647   2488f fs/cifs/file.o
After:
   text    data     bss     dec     hex filename
 111140   37856     448  149444   247c4 fs/cifs/file.o
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2017-08-01 | fs: convert a pile of fsync routines to errseq_t based reporting | Jeff Layton | 1 | -2/+2
This patch converts most of the in-kernel filesystems that do writeback out of the pagecache to report errors using the errseq_t-based infrastructure that was recently added. This allows them to report errors once for each open file description. Most filesystems have a fairly straightforward fsync operation. They call filemap_write_and_wait_range to write back all of the data and wait on it, and then (sometimes) sync out the metadata. For those filesystems this is a straightforward conversion from calling filemap_write_and_wait_range in their fsync operation to calling file_write_and_wait_range. Acked-by: Jan Kara <jack@suse.cz> Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Jeff Layton <jlayton@redhat.com>
2017-07-06 | CIFS: fix circular locking dependency | Rabin Vincent | 1 | -2/+2
When a CIFS filesystem is mounted with the forcemand option and the following command is run on it, lockdep warns about a circular locking dependency between CifsInodeInfo::lock_sem and the inode lock. while echo foo > hello; do :; done & while touch -c hello; do :; done cifs_writev() takes the locks in the wrong order, but note that we can't only flip the order around because it releases the inode lock before the call to generic_write_sync() while it holds the lock_sem across that call. But, AFAICS, there is no need to hold the CifsInodeInfo::lock_sem across the generic_write_sync() call either, so we can release both the locks before generic_write_sync(), and change the order. ====================================================== WARNING: possible circular locking dependency detected 4.12.0-rc7+ #9 Not tainted ------------------------------------------------------ touch/487 is trying to acquire lock: (&cifsi->lock_sem){++++..}, at: cifsFileInfo_put+0x88f/0x16a0 but task is already holding lock: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: utimes_common+0x3ad/0x870 which lock already depends on the new lock. 
the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#11){+.+.+.}: __lock_acquire+0x1f74/0x38f0 lock_acquire+0x1cc/0x600 down_write+0x74/0x110 cifs_strict_writev+0x3cb/0x8c0 __vfs_write+0x4c1/0x930 vfs_write+0x14c/0x2d0 SyS_write+0xf7/0x240 entry_SYSCALL_64_fastpath+0x1f/0xbe -> #0 (&cifsi->lock_sem){++++..}: check_prevs_add+0xfa0/0x1d10 __lock_acquire+0x1f74/0x38f0 lock_acquire+0x1cc/0x600 down_write+0x74/0x110 cifsFileInfo_put+0x88f/0x16a0 cifs_setattr+0x992/0x1680 notify_change+0x61a/0xa80 utimes_common+0x3d4/0x870 do_utimes+0x1c1/0x220 SyS_utimensat+0x84/0x1a0 entry_SYSCALL_64_fastpath+0x1f/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#11); lock(&cifsi->lock_sem); lock(&sb->s_type->i_mutex_key#11); lock(&cifsi->lock_sem); *** DEADLOCK *** 2 locks held by touch/487: #0: (sb_writers#10){.+.+.+}, at: mnt_want_write+0x41/0xb0 #1: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: utimes_common+0x3ad/0x870 stack backtrace: CPU: 0 PID: 487 Comm: touch Not tainted 4.12.0-rc7+ #9 Call Trace: dump_stack+0xdb/0x185 print_circular_bug+0x45b/0x790 __lock_acquire+0x1f74/0x38f0 lock_acquire+0x1cc/0x600 down_write+0x74/0x110 cifsFileInfo_put+0x88f/0x16a0 cifs_setattr+0x992/0x1680 notify_change+0x61a/0xa80 utimes_common+0x3d4/0x870 do_utimes+0x1c1/0x220 SyS_utimensat+0x84/0x1a0 entry_SYSCALL_64_fastpath+0x1f/0xbe Fixes: 19dfc1f5f2ef03a52 ("cifs: fix the race in cifs_writev()") Signed-off-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
2017-07-06 | cifs: set mapping error when page writeback fails in writepage or launder_pages | Jeff Layton | 1 | -5/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Steve French <smfrench@gmail.com>
2017-06-21 | CIFS: Set ->should_dirty in cifs_user_readv() | Dan Carpenter | 1 | -1/+1
The current code causes a static checker warning because ITER_IOVEC is zero so the condition is never true. Fixes: 6685c5e2d1ac ("CIFS: Add asynchronous read support through kernel AIO") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <smfrench@gmail.com>
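The static checker warning comes from testing a zero-valued constant with bitwise AND: `x & 0` is always 0, so the branch is dead. The sketch below reproduces the shape of the problem. The enum values are illustrative (only the fact that ITER_IOVEC is zero comes from the commit message), and the corrected test shown here is one possible form, not necessarily the one the patch uses.

```c
/* Illustrative iterator kinds; only "IOVEC is zero" is taken from the
 * commit message, the other values are assumptions for this sketch. */
enum sketch_iter_type {
    SKETCH_ITER_IOVEC = 0,
    SKETCH_ITER_KVEC  = 2,
    SKETCH_ITER_BVEC  = 4
};

/* Broken test: masking with a zero-valued constant is never true. */
static int is_iovec_by_mask(unsigned int type)
{
    return (type & SKETCH_ITER_IOVEC) != 0;
}

/* Working test: a user iovec is whatever has none of the other
 * kind bits set. */
static int is_iovec_by_exclusion(unsigned int type)
{
    return !(type & (SKETCH_ITER_KVEC | SKETCH_ITER_BVEC));
}
```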
2017-05-10 | CIFS: silence lockdep splat in cifs_relock_file() | Rabin Vincent | 1 | -1/+1
cifs_relock_file() can perform a down_write() on the inode's lock_sem even though it was already performed in cifs_strict_readv(). Lockdep complains about this. AFAICS, there is no problem here, and lockdep just needs to be told that this nesting is OK. ============================================= [ INFO: possible recursive locking detected ] 4.11.0+ #20 Not tainted --------------------------------------------- cat/701 is trying to acquire lock: (&cifsi->lock_sem){++++.+}, at: cifs_reopen_file+0x7a7/0xc00 but task is already holding lock: (&cifsi->lock_sem){++++.+}, at: cifs_strict_readv+0x177/0x310 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&cifsi->lock_sem); lock(&cifsi->lock_sem); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by cat/701: #0: (&cifsi->lock_sem){++++.+}, at: cifs_strict_readv+0x177/0x310 stack backtrace: CPU: 0 PID: 701 Comm: cat Not tainted 4.11.0+ #20 Call Trace: dump_stack+0x85/0xc2 __lock_acquire+0x17dd/0x2260 ? trace_hardirqs_on_thunk+0x1a/0x1c ? preempt_schedule_irq+0x6b/0x80 lock_acquire+0xcc/0x260 ? lock_acquire+0xcc/0x260 ? cifs_reopen_file+0x7a7/0xc00 down_read+0x2d/0x70 ? cifs_reopen_file+0x7a7/0xc00 cifs_reopen_file+0x7a7/0xc00 ? printk+0x43/0x4b cifs_readpage_worker+0x327/0x8a0 cifs_readpage+0x8c/0x2a0 generic_file_read_iter+0x692/0xd00 cifs_strict_readv+0x29f/0x310 generic_file_splice_read+0x11c/0x1c0 do_splice_to+0xa5/0xc0 splice_direct_to_actor+0xfa/0x350 ? generic_pipe_buf_nosteal+0x10/0x10 do_splice_direct+0xb5/0xe0 do_sendfile+0x278/0x3a0 SyS_sendfile64+0xc4/0xe0 entry_SYSCALL_64_fastpath+0x1f/0xbe Signed-off-by: Rabin Vincent <rabinv@axis.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-05-02 | CIFS: Add asynchronous write support through kernel AIO | Pavel Shilovsky | 1 | -51/+137
This patch adds support for processing write calls passed by io_submit() asynchronously. It is based on the previously introduced async context, which allows i/o responses to be processed in a separate thread and returns to the caller immediately for asynchronous calls. This improves the write performance of single-threaded applications as the i/o queue depth increases. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-05-02 | CIFS: Add asynchronous read support through kernel AIO | Pavel Shilovsky | 1 | -39/+130
This patch adds support for processing read calls passed by io_submit() asynchronously. It is based on the previously introduced async context, which allows i/o responses to be processed in a separate thread and returns to the caller immediately for asynchronous calls. This improves the read performance of single-threaded applications as the i/o queue depth increases. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-04-11 | CIFS: store results of cifs_reopen_file to avoid infinite wait | Germano Percossi | 1 | -3/+3
This fixes Continuous Availability when errors are encountered during file reopen. cifs_user_readv and cifs_user_writev would wait forever if the results of cifs_reopen_file are not stored for later inspection. In fact, results are checked and, in case of errors, a chain of function calls leading to reads and writes being scheduled in a separate thread is skipped. These threads will wake up the corresponding waiters once reads and writes are done. However, given that the return value is not stored, when rc is checked for errors a previous one (always zero) is inspected instead. This leads to pending reads/writes being added to the list, making cifs_user_readv and cifs_user_writev wait forever. Signed-off-by: Germano Percossi <germano.percossi@citrix.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com>
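The bug pattern described here (a return value that is computed but never stored, so a stale rc of zero is checked instead) is easy to reproduce in miniature. The functions below are hypothetical stand-ins; returning 1 for "request queued" is a convention of this sketch only.

```c
#include <errno.h>

/* Hypothetical stand-in for reopening a file handle. */
static int sketch_reopen(int fail)
{
    return fail ? -EAGAIN : 0;
}

/* Buggy shape: the reopen result is dropped, so the check sees the
 * stale rc (always 0) and the request is queued anyway. */
static int submit_buggy(int reopen_fails)
{
    int rc = 0;

    (void)sketch_reopen(reopen_fails);  /* result not stored */
    if (rc)
        return rc;
    return 1;                           /* queued; nobody will wake us */
}

/* Fixed shape: store the result so the error check can see it. */
static int submit_fixed(int reopen_fails)
{
    int rc = sketch_reopen(reopen_fails);

    if (rc)
        return rc;
    return 1;
}
```

In the buggy variant a failed reopen still leads to a queued request that no worker will ever complete, which matches the endless wait the commit message describes.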
2017-02-25mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmfDave Jiang1-1/+1
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-02CIFS: Add copy into pages callback for a read operationPavel Shilovsky1-6/+46
Since we have two different types of reads (pagecache and direct), we need to process their responses differently after decrypting a packet. This change allows specifying a callback that copies the read payload data into preallocated pages. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
2017-02-02CIFS: Fix splice read for non-cached filesPavel Shilovsky1-1/+9
Currently we call copy_page_to_iter() for uncached reading into a pipe. This is wrong because it treats pages as VFS cache pages and copies references rather than actual data. When we are trying to read from the pipe we end up calling page_cache_pipe_buf_confirm() which returns -ENODATA. This error is translated into 0 which is returned to a user. This issue is reproduced by running xfs-tests suite (generic test #249) against mount points with "cache=none". Fix it by mapping pages manually and calling copy_to_iter() that copies data into the pipe. Cc: Stable <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
2016-12-05CIFS: Fix a possible double locking of mutex during reconnectPavel Shilovsky1-1/+7
With the current code it is possible to lock a mutex twice when subsequent reconnects are triggered. On the 1st reconnect we reconnect sessions and tcons and then persistent file handles. If the 2nd reconnect happens while persistent file handles are being reconnected, the following sequence of calls is observed: cifs_reopen_file -> SMB2_open -> small_smb2_init -> smb2_reconnect -> cifs_reopen_persistent_file_handles -> cifs_reopen_file (again!). So we try to acquire the same cfile->fh_mutex twice, which is wrong. Fix this by moving the reconnecting of persistent handles to delayed work (smb2_reconnect_server) and submitting this work every time we reconnect a tcon in the SMB2 command handling codepath. The recursive call can also corrupt the temporary file list in cifs_reopen_persistent_file_handles(), because the function can be entered a second time before the first call has finished. Cc: Stable <stable@vger.kernel.org> # v4.9+ Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
2016-10-14CIFS: Reset read oplock to NONE if we have mandatory locks after reopenPavel Shilovsky1-0/+9
We are already doing the same thing for an ordinary open case: we can't keep a read oplock on a file if we have mandatory byte-range locks, because page reading can conflict with these locks on the server. Fix it by setting the oplock level to NONE. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2016-10-14CIFS: Fix persistent handles re-opening on reconnectPavel Shilovsky1-5/+17
The openFileList of a tcon can change while cifs_reopen_file() is being called, which can lead to unexpected behavior when we return to the loop. Fix this by introducing a temporary list that keeps all file handles that need to be reopened. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
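The temporary-list idea can be sketched in userspace C as a two-pass loop: first snapshot the handles that need reopening, then reopen them from the snapshot. This is a simplified illustration, not the kernel implementation: `struct handle`, `struct open_list`, `reopen_stub()` and `reopen_all()` are hypothetical names, a fixed-size array stands in for the kernel list, and locking and reference counting are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 8

/* Hypothetical open-file handle. */
struct handle {
    int invalid;  /* non-zero if the handle must be reopened */
    int reopened; /* number of times it has been reopened */
};

/* Hypothetical shared list of open handles. */
struct open_list {
    struct handle *items[MAX_HANDLES];
    size_t count;
};

/* Hypothetical reopen that, as a side effect, shrinks the shared list to
 * simulate the list changing while the reopen is in progress. */
static void reopen_stub(struct open_list *list, struct handle *h)
{
    h->invalid = 0;
    h->reopened++;
    if (list->count > 0)
        list->count--;
}

/* Returns the number of handles reopened. */
static size_t reopen_all(struct open_list *list)
{
    struct handle *tmp[MAX_HANDLES];
    size_t n = 0, i;

    assert(list->count <= MAX_HANDLES);

    /* Pass 1: walk the shared list once and collect the work. */
    for (i = 0; i < list->count; i++)
        if (list->items[i]->invalid)
            tmp[n++] = list->items[i];

    /* Pass 2: iterate over the private snapshot only, so changes to the
     * shared list made by reopen_stub() cannot disturb this loop. */
    for (i = 0; i < n; i++)
        reopen_stub(list, tmp[i]);

    return n;
}
```

Because the second loop never reads the shared list, every handle collected in the first pass is reopened exactly once, even though `reopen_stub()` modifies the list underneath.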
2016-10-12fs/cifs: reopen persistent handles on reconnectSteve French1-0/+18
Continuous Availability features like persistent handles require that clients reconnect their open files, not just the sessions, soon after the network connection comes back up; otherwise the server will throw away the state (byte range locks, leases, deny modes) on those handles after a timeout. Add code to reconnect handles when use_persistent is set (e.g. Continuous Availability shares) after tree reconnect. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Germano Percossi <germano.percossi@citrix.com> Signed-off-by: Steve French <smfrench@gmail.com>
2016-10-12Clarify locking of cifs file and tcon structures and make more granularSteve French1-27/+39
Remove the global file_list_lock to simplify cifs/smb3 locking and have spinlocks that more closely match the information they are protecting. Add new tcon->open_file_lock and file->file_info_lock spinlocks. Locks continue to follow a hierarchy, cifs_socket --> cifs_ses --> cifs_tcon --> cifs_file, where the global tcp_ses_lock still protects socket and cifs_ses, while the newer locks protect the lower-level structures' information (tcon and cifs_file respectively). CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <steve.french@primarydata.com> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Germano Percossi <germano.percossi@citrix.com>
2016-10-11Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-2/+2
Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning
2016-09-28fs: Replace current_fs_time() with current_time()Deepa Dinamani1-2/+2
current_fs_time() uses struct super_block* as an argument. As per Linus's suggestion, this is changed to take struct inode* as a parameter instead. This is because the function is primarily meant for vfs inode timestamps. Also the function was renamed as per Arnd's suggestion. Change all calls to current_fs_time() to use the new current_time() function instead. current_fs_time() will be deleted. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>