summaryrefslogtreecommitdiff
path: root/fs/nfs
AgeCommit message (Collapse)AuthorFilesLines
2017-06-17nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"Chuck Lever1-0/+1
[ Upstream commit 406dab8450ec76eca88a1af2fc15d18a2b36ca49 ] Lock sequence IDs are bumped in decode_lock by calling nfs_increment_seqid(). nfs_increment_sequid() does not use the seqid_mutating_err() function fixed in commit 059aa7348241 ("Don't increment lock sequence ID after NFS4ERR_MOVED"). Fixes: 059aa7348241 ("Don't increment lock sequence ID after ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Xuan Qi <xuan.qi@oracle.com> Cc: stable@vger.kernel.org # v3.7+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14NFSv4: Don't perform cached access checks before we've OPENed the fileTrond Myklebust1-0/+3
commit 762674f86d0328d5dc923c966e209e1ee59663f2 upstream. Donald Buczek reports that a nfs4 client incorrectly denies execute access based on outdated file mode (missing 'x' bit). After the mode on the server is 'fixed' (chmod +x) further execution attempts continue to fail, because the nfs ACCESS call updates the access parameter but not the mode parameter or the mode in the inode. The root cause is ultimately that the VFS is calling may_open() before the NFS client has a chance to OPEN the file and hence revalidate the access and attribute caches. Al Viro suggests: >>> Make nfs_permission() relax the checks when it sees MAY_OPEN, if you know >>> that things will be caught by server anyway? >> >> That can work as long as we're guaranteed that everything that calls >> inode_permission() with MAY_OPEN on a regular file will also follow up >> with a vfs_open() or dentry_open() on success. Is this always the >> case? > > 1) in do_tmpfile(), followed by do_dentry_open() (not reachable by NFS since > it doesn't have ->tmpfile() instance anyway) > > 2) in atomic_open(), after the call of ->atomic_open() has succeeded. > > 3) in do_last(), followed on success by vfs_open() > > That's all. All calls of inode_permission() that get MAY_OPEN come from > may_open(), and there's no other callers of that puppy. Reported-by: Donald Buczek <buczek@molgen.mpg.de> Link: https://bugzilla.kernel.org/show_bug.cgi?id=109771 Link: http://lkml.kernel.org/r/1451046656-26319-1-git-send-email-buczek@molgen.mpg.de Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14NFS: Ensure we revalidate attributes before using execute_ok()Trond Myklebust1-2/+16
commit 5c5fc09a1157a11dbe84e6421c3e0b37d05238cb upstream. Donald Buczek reports that NFS clients can also report incorrect results for access() due to lack of revalidation of attributes before calling execute_ok(). Looking closely, it seems chdir() is afflicted with the same problem. Fix is to ensure we call nfs_revalidate_inode_rcu() or nfs_revalidate_inode() as appropriate before deciding to trust execute_ok(). Reported-by: Donald Buczek <buczek@molgen.mpg.de> Link: http://lkml.kernel.org/r/1451331530-3748-1-git-send-email-buczek@molgen.mpg.de Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12NFSv4: fix getacl ERANGE for some ACL buffer sizesWeston Andros Adamson1-6/+2
commit ed92d8c137b7794c2c2aa14479298b9885967607 upstream. We're not taking into account that the space needed for the (variable length) attr bitmap, with the result that we'd sometimes get a spurious ERANGE when the ACL data got close to the end of a page. Just add in an extra page to make sure. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12NFSv4: fix getacl head length estimationJ. Bruce Fields1-1/+1
commit 6682c14bbe505a8b912c57faf544f866777ee48d upstream. Bitmap and attrlen follow immediately after the op reply header. This was an oversight from commit bf118a342f. Consequences of this are just minor efficiency (extra calls to xdr_shrink_bufhead). Fixes: bf118a342f10 "NFSv4: include bitmap in nfsv4 get acl data" Reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12NFSv4: Fix memory and state leak in _nfs4_open_and_get_stateTrond Myklebust1-1/+1
commit a974deee477af89411e0f80456bfb344ac433c98 upstream. If we exit because the file access check failed, we currently leak the struct nfs4_state. We need to attach it to the open context before returning. Fixes: 3efb9722475e ("NFSv4: Refactor _nfs4_open_and_get_state..") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-01NFSv4.0: always send mode in SETATTR after EXCLUSIVE4Benjamin Coddington1-1/+2
commit a430607b2ef7c3be090f88c71cfcb1b3988aa7c0 upstream. Some nfsv4.0 servers may return a mode for the verifier following an open with EXCLUSIVE4 createmode, but this does not mean the client should skip setting the mode in the following SETATTR. It should only do that for EXCLUSIVE4_1 or UNGAURDED createmode. Fixes: 5334c5bdac92 ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.NeilBrown1-1/+2
commit cfd278c280f997cf2fe4662e0acab0fe465f637b upstream. Various places assume that if nfs4_fl_prepare_ds() turns a non-NULL 'ds', then ds->ds_clp will also be non-NULL. This is not necessasrily true in the case when the process received a fatal signal while nfs4_pnfs_ds_connect is waiting in nfs4_wait_ds_connect(). In that case ->ds_clp may not be set, and the devid may not recently have been marked unavailable. So add a test for ds_clp == NULL and return NULL in that case. Fixes: c23266d532b4 ("NFS4.1 Fix data server connection race") Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Olga Kornievskaia <aglo@umich.edu> Acked-by: Adamson, Andy <William.Adamson@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19NFS: Fix a performance regression in readdirTrond Myklebust1-13/+2
commit 79f687a3de9e3ba2518b4ea33f38ca6cbe9133eb upstream. Ben Coddington reports that commit 311324ad1713, by adding the function nfs_dir_mapping_need_revalidate() that checks page cache validity on each call to nfs_readdir() causes a performance regression when the directory is being modified. If the directory is changing while we're iterating through the directory, POSIX does not require us to invalidate the page cache unless the user calls rewinddir(). However, we still do want to ensure that we use readdirplus in order to avoid a load of stat() calls when the user is doing an 'ls -l' workload. The fix should be to invalidate the page cache immediately when we're setting the NFS_INO_ADVISE_RDPLUS bit. Reported-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 311324ad1713 ("NFS: Be more aggressive in using readdirplus...") Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19pNFS: Fix race in pnfs_wait_on_layoutreturnTrond Myklebust1-5/+3
commit ee284e35d8c71bf5d4d807eaff6f67a17134b359 upstream. We must put the task to sleep while holding the inode->i_lock in order to ensure atomicity with the test for NFS_LAYOUT_RETURN. Fixes: 500d701f336b ("NFS41: make close wait for layoutreturn") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09nfs_write_end(): fix handling of short copiesAl Viro1-1/+1
commit c0cf3ef5e0f47e385920450b245d22bead93e7ad upstream. What matters when deciding if we should make a page uptodate is not how much we _wanted_ to copy, but how much we actually have copied. As it is, on architectures that do not zero tail on short copy we can leave uninitialized data in page marked uptodate. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-02NFSv4.x: hide array-bounds warningArnd Bergmann1-1/+1
commit d55b352b01bc78fbc3d1bb650140668b87e58bf9 upstream. A correct bugfix introduced a harmless warning that shows up with gcc-7: fs/nfs/callback.c: In function 'nfs_callback_up': fs/nfs/callback.c:214:14: error: array subscript is outside array bounds [-Werror=array-bounds] What happens here is that the 'minorversion == 0' check tells the compiler that we assume minorversion can be something other than 0, but when CONFIG_NFS_V4_1 is disabled that would be invalid and result in an out-of-bounds access. The added check for IS_ENABLED(CONFIG_NFS_V4_1) tells gcc that this really can't happen, which makes the code slightly smaller and also avoids the warning. The bugfix that introduced the warning is marked for stable backports, we want this one backported to the same releases. Fixes: 98b0f80c2396 ("NFSv4.x: Fix a refcount leak in nfs_callback_up_net") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_genericJeff Layton1-0/+1
commit 3f807e5ae5597bd65a6fff684083e8eaa21f3fa7 upstream. The caller of rpc_run_task also gets a reference that must be put. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28NFSv4: Open state recovery must account for file permission changesTrond Myklebust1-0/+3
commit 304020fe48c6c7fff8b5a38f382b54404f0f79d3 upstream. If the file permissions change on the server, then we may not be able to recover open state. If so, we need to ensure that we mark the file descriptor appropriately. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalidTrond Myklebust1-4/+13
commit aa05c87f23efe417adc7ff9b4193b7201ec0dd79 upstream. We must not allow the use of delegations that have been revoked or are being returned. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation()...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28NFSv4: Don't report revoked delegations as valid in nfs_have_delegation()Trond Myklebust1-0/+1
commit b3f9e7239074613aa6bdafa4caf7c104fe1e7276 upstream. If the delegation is revoked, then it can't be used for caching. Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation()...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07pNFS/flexfiles: Fix layoutcommit after a commit to DSTrond Myklebust1-2/+1
commit c001c87a63aa2f35358e33eb05e45e4cbcb34f54 upstream. We should always do a layoutcommit after commit to DS, except if the layout segment we're using has set FF_FLAGS_NO_LAYOUTCOMMIT. Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07pNFS/files: Fix layoutcommit after a commit to DSTrond Myklebust1-2/+1
commit 73e6c5d854d3f7f75e8b46d3e54aeb5d83fe6b1f upstream. According to the errata https://www.rfc-editor.org/errata_search.php?rfc=5661&eid=2751 we should always send layout commit after a commit to DS. Fixes: bc7d4b8fd091 ("nfs/filelayout: set layoutcommit...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07NFS: Don't drop CB requests with invalid principalsChuck Lever1-1/+5
commit a4e187d83d88eeaba6252aac0a2ffe5eaa73a818 upstream. Before commit 778be232a207 ("NFS do not find client in NFSv4 pg_authenticate"), the Linux callback server replied with RPC_AUTH_ERROR / RPC_AUTH_BADCRED, instead of dropping the CB request. Let's restore that behavior so the server has a chance to do something useful about it, and provide a warning that helps admins correct the problem. Fixes: 778be232a207 ("NFS do not find client in NFSv4 ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24NFSv4.1: Fix the CREATE_SESSION slot number accountingTrond Myklebust1-2/+10
commit b519d408ea32040b1c7e10b155a3ee9a36660947 upstream. Ensure that we conform to the algorithm described in RFC5661, section 18.36.4 for when to bump the sequence id. In essence we do it for all cases except when the RPC call timed out, or in case of the server returning NFS4ERR_DELAY or NFS4ERR_STALE_CLIENTID. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialisedTrond Myklebust1-0/+3
commit bf0291dd2267a2b9a4cd74d65249553d11bb45d6 upstream. According to RFC5661, the client is responsible for serialising LAYOUTGET and LAYOUTRETURN to avoid ambiguity. Consider the case where we send both in parallel. Client Server ====== ====== LAYOUTGET(seqid=X) LAYOUTRETURN(seqid=X) LAYOUTGET return seqid=X+1 LAYOUTRETURN return seqid=X+2 Process LAYOUTRETURN Forget layout stateid Process LAYOUTGET Set seqid=X+1 The client processes the layoutget/layoutreturn in the wrong order, and since the result of the layoutreturn was to clear the only existing layout segment, the client forgets the layout stateid. When the LAYOUTGET comes in, it is treated as having a completely new stateid, and so the client sets the wrong sequence id... Fix is to check if there are outstanding LAYOUTGET requests before we send the LAYOUTRETURN (note that LAYOUGET will already wait if it sees an outstanding LAYOUTRETURN). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24NFSv4.x: Fix a refcount leak in nfs_callback_up_netTrond Myklebust1-0/+1
commit 98b0f80c2396224bbbed81792b526e6c72ba9efa upstream. On error, the callers expect us to return without bumping nn->cb_users[]. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24pNFS: The client must not do I/O to the DS if it's lease has expiredTrond Myklebust1-0/+1
commit b88fa69eaa8649f11828158c7b65c4bcd886ebd5 upstream. Ensure that the client conforms to the normative behaviour described in RFC5661 Section 12.7.2: "If a client believes its lease has expired, it MUST NOT send I/O to the storage device until it has validated its lease." So ensure that we wait for the lease to be validated before using the layout. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-20nfs: don't create zero-length requestsBenjamin Coddington1-1/+4
commit 149a4fddd0a72d526abbeac0c8deaab03559836a upstream. NFS doesn't expect requests with wb_bytes set to zero and may make unexpected decisions about how to handle that request at the page IO layer. Skip request creation if we won't have any wb_bytes in the request. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-27NFS: Fix another OPEN_DOWNGRADE bugTrond Myklebust1-3/+2
commit e547f2628327fec6afd2e03b46f113f614cca05b upstream. Olga Kornievskaia reports that the following test fails to trigger an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE. fd0 = open(foo, RDRW) -- should be open on the wire for "both" fd1 = open(foo, RDONLY) -- should be open on the wire for "read" close(fd0) -- should trigger an open_downgrade read(fd1) close(fd1) The issue is that we're missing a check for whether or not the current state transitioned from an O_RDWR state as opposed to having transitioned from a combination of O_RDONLY and O_WRONLY. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: cd9288ffaea4 ("NFSv4: Fix another bug in the close/open_downgrade code") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-27make nfs_atomic_open() call d_drop() on all ->open_context() errors.Al Viro1-1/+1
commit d20cb71dbf3487f24549ede1a8e2d67579b4632e upstream. In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code" unconditional d_drop() after the ->open_context() had been removed. It had been correct for success cases (there ->open_context() itself had been doing dcache manipulations), but not for error ones. Only one of those (ENOENT) got a compensatory d_drop() added in that commit, but in fact it should've been done for all errors. As it is, the case of O_CREAT non-exclusive open on a hashed negative dentry racing with e.g. symlink creation from another client ended up with ->open_context() getting an error and proceeding to call nfs_lookup(). On a hashed dentry, which would've instantly triggered BUG_ON() in d_materialise_unique() (or, these days, its equivalent in d_splice_alias()). Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-20nfs: use file_dentry()Miklos Szeredi3-6/+6
commit be62a1a8fd116f5cd9e53726601f970e16e17558 upstream. NFS may be used as lower layer of overlayfs and accessing f_path.dentry can lead to a crash. Fix by replacing direct access of file->f_path.dentry with the file_dentry() accessor, which will always return a native object. Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-04NFSv4: Fix a dentry leak on alias useBenjamin Coddington1-2/+2
commit d9dfd8d741683347ee159d25f5b50c346a0df557 upstream. In the case where d_add_unique() finds an appropriate alias to use it will have already incremented the reference count. An additional dget() to swap the open context's dentry is unnecessary and will leak a reference. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 275bb307865a3 ("NFSv4: Move dentry instantiation into the NFSv4-...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25pNFS/flexfiles: Fix an XDR encoding bug in layoutreturnTrond Myklebust1-4/+2
commit 082fa37d1351a41afc491d44a1d095cb8d919aa2 upstream. We must not skip encoding the statistics, or the server will see an XDR encoding error. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25nfs: Fix race in __update_open_stateid()Andrew Elble1-1/+1
commit 361cad3c89070aeb37560860ea8bfc092d545adc upstream. We've seen this in a packet capture - I've intermixed what I think was going on. The fix here is to grab the so_lock sooner. 1964379 -> #1 open (for write) reply seqid=1 1964393 -> #2 open (for read) reply seqid=2 __nfs4_close(), state->n_wronly-- nfs4_state_set_mode_locked(), changes state->state = [R] state->flags is [RW] state->state is [R], state->n_wronly == 0, state->n_rdonly == 1 1964398 -> #3 open (for write) call -> because close is already running 1964399 -> downgrade (to read) call seqid=2 (close of #1) 1964402 -> #3 open (for write) reply seqid=3 __update_open_stateid() nfs_set_open_stateid_locked(), changes state->flags state->flags is [RW] state->state is [R], state->n_wronly == 0, state->n_rdonly == 1 new sequence number is exposed now via nfs4_stateid_copy() next step would be update_open_stateflags(), pending so_lock 1964403 -> downgrade reply seqid=2, fails with OLD_STATEID (close of #1) nfs4_close_prepare() gets so_lock and recalcs flags -> send close 1964405 -> downgrade (to read) call seqid=3 (close of #1 retry) __update_open_stateid() gets so_lock * update_open_stateflags() updates state->n_wronly. nfs4_state_set_mode_locked() updates state->state state->flags is [RW] state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1 * should have suppressed the preceding nfs4_close_prepare() from sending open_downgrade 1964406 -> write call 1964408 -> downgrade (to read) reply seqid=4 (close of #1 retry) nfs_clear_open_stateid_locked() state->flags is [R] state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1 1964409 -> write reply (fails, openmode) Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh()Trond Myklebust1-1/+1
commit 86fb449b07b8215443a30782dca5755d5b8b0577 upstream. Jeff reports seeing an Oops in ff_layout_alloc_lseg. Turns out copy+paste has played cruel tricks on a nested loop. Reported-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25NFS: Fix attribute cache revalidationTrond Myklebust1-15/+39
commit ade14a7df796d4e86bd9d181193c883a57b13db0 upstream. If a NFSv4 client uses the cache_consistency_bitmask in order to request only information about the change attribute, timestamps and size, then it has not revalidated all attributes, and hence the attribute timeout timestamp should not be updated. Reported-by: Donald Buczek <buczek@molgen.mpg.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-17NFSv4.1/pnfs: Fixup an lo->plh_block_lgets imbalance in layoutreturnTrond Myklebust1-1/+0
commit 1a093ceb053832c25b92f3cf26b957543c7baf9b upstream. Since commit 2d8ae84fbc32, nothing is bumping lo->plh_block_lgets in the layoutreturn path, so it should not be touched in nfs4_layoutreturn_release either. Fixes: 2d8ae84fbc32 ("NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14sched/wait: Fix the signal handling fixPeter Zijlstra4-7/+7
Jan Stancek reported that I wrecked things for him by fixing things for Vladimir :/ His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which should not be possible, however my previous patch made this possible by unconditionally checking signal_pending(). We cannot use current->state as was done previously, because the instruction after the store to that variable it can be changed. We must instead pass the initial state along and use that. Fixes: 68985633bccb ("sched/wait: Fix signal handling in bit wait helpers") Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Chris Mason <clm@fb.com> Tested-by: Jan Stancek <jstancek@redhat.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Chris Mason <clm@fb.com> Reviewed-by: Paul Turner <pjt@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: tglx@linutronix.de Cc: Oleg Nesterov <oleg@redhat.com> Cc: hpa@zytor.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-13Merge tag 'nfs-for-4.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-5/+2
Pull NFS client bugfix from Trond Myklebust: "SUNRPC: Fix a NFSv4.1 callback channel regression" * tag 'nfs-for-4.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Fix callback channel
2015-12-12osd fs: __r4w_get_page rely on PageUptodate for uptodateHugh Dickins1-4/+1
Commit 42cb14b110a5 ("mm: migrate dirty page without clear_page_dirty_for_io etc") simplified the migration of a PageDirty pagecache page: one stat needs moving from zone to zone and that's about all. It's convenient and safest for it to shift the PageDirty bit from old page to new, just before updating the zone stats: before copying data and marking the new PageUptodate. This is all done while both pages are isolated and locked, just as before; and just as before, there's a moment when the new page is visible in the radix_tree, but not yet PageUptodate. What's new is that it may now be briefly visible as PageDirty before it is PageUptodate. When I scoured the tree to see if this could cause a problem anywhere, the only places I found were in two similar functions __r4w_get_page(): which look up a page with find_get_page() (not using page lock), then claim it's uptodate if it's PageDirty or PageWriteback or PageUptodate. I'm not sure whether that was right before, but now it might be wrong (on rare occasions): only claim the page is uptodate if PageUptodate. Or perhaps the page in question could never be migratable anyway? Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Boaz Harrosh <ooo@electrozaur.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-08SUNRPC: Fix callback channelTrond Myklebust1-5/+2
The NFSv4.1 callback channel is currently broken because the receive message will keep shrinking because the backchannel receive buffer size never gets reset. The easiest solution to this problem is instead of changing the receive buffer, to rather adjust the copied request. Fixes: 38b7631fbe42 ("nfs4: limit callback decoding to received bytes") Cc: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-25nfs4: resend LAYOUTGET when there is a race that changes the seqidJeff Layton1-25/+31
pnfs_layout_process will check the returned layout stateid against what the kernel has in-core. If it turns out that the stateid we received is older, then we should resend the LAYOUTGET instead of falling back to MDS I/O. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-25nfs: if we have no valid attrs, then don't declare the attribute cache validJeff Layton1-1/+5
If we pass in an empty nfs_fattr struct to nfs_update_inode, it will (correctly) not update any of the attributes, but it then clears the NFS_INO_INVALID_ATTR flag, which indicates that the attributes are up to date. Don't clear the flag if the fattr struct has no valid attrs to apply. Reviewed-by: Steve French <steve.french@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-25nfs: ensure that attrcache is revalidated after a SETATTRJeff Layton1-1/+4
If we get no post-op attributes back from a SETATTR operation, then no attributes will of course be updated during the call to nfs_update_inode. We know however that the attributes are invalid at that point, since we just changed some of them. At the very least, the ctime will be bogus. If we get no post-op attributes back on the call, mark the attrcache invalid to reflect that fact. Reviewed-by: Steve French <steve.french@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs4: limit callback decoding to received bytesBenjamin Coddington1-2/+5
A truncated cb_compound request will cause the client to decode null or data from a previous callback for nfs4.1 backchannel case, or uninitialized data for the nfs4.0 case. This is because the path through svc_process_common() advances the request's iov_base and decrements iov_len without adjusting the overall xdr_buf's len field. That causes xdr_init_decode() to set up the xdr_stream with an incorrect length in nfs4_callback_compound(). Fixing this for the nfs4.1 backchannel case first requires setting the correct iov_len and page_len based on the length of received data in the same manner as the nfs4.0 case. Then the request's xdr_buf length can be adjusted for both cases based upon the remaining iov_len and page_len. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs4: start callback_ident at idr 1Benjamin Coddington1-1/+1
If clp->cl_cb_ident is zero, then nfs_cb_idr_remove_locked() skips removing it when the nfs_client is freed. A decoding or server bug can then find and try to put that first nfs_client which would lead to a crash. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: d6870312659d ("nfs4client: convert to idr_alloc()") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAYJeff Layton1-1/+1
When LAYOUTGET gets NFS4ERR_DELAY, we currently will wait 15s before retrying the call. That is a _very_ long time, so add a timeout value to struct nfs4_layoutget and pass nfs4_async_handle_error a pointer to it. This allows the RPC engine to use a sliding delay window, instead of a 15s delay. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24NFS4: Cleanup FATTR4_WORD0_FS_LOCATIONS after decoding successKinglong Mee1-0/+1
Commit 1ca843a2d2 "nfs: Fix GETATTR bitmap verification" has check the bitmap after decoding success, but decode_attr_fs_locations forgets cleanup the FATTR4_WORD0_FS_LOCATIONS bits. decode_getfattr_attrs always return -EIO when meeting FS_LOCATIONS now. ls: cannot access /mnt/referal: Input/output error ls: cannot access /mnt/replicas: Input/output error total 32 drwxr-xr-x. 7 root root 8192 Nov 16 20:36 pnfs ??????????? ? ? ? ? ? referal ??????????? ? ? ? ? ? replicas v2: clear the bit earlier Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24NFS: Properly set NFS v4.2 NFSDBG_FACILITYAnna Schumaker1-1/+1
NFS v4.2 operations can work outside of pNFS, so dprintk() output shouldn't be placed under NFSDBG_PNFS. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs: reduce the amount of ifdefs for v4.2 in nfs4file.cChristoph Hellwig1-20/+7
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs: use btrfs ioctl defintions for cloneChristoph Hellwig1-4/+6
The NFS CLONE_RANGE defintion was wrong and thus never worked. Fix this by simply using the btrfs ioctl defintion. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs: allow intra-file CLONEChristoph Hellwig1-5/+14
Originally CLONE didn't allow for intra-file clones, but we recently updated the spec to support this feature which is also supported by local Linux file systems. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs: offer native ioctls even if CONFIG_COMPAT is setChristoph Hellwig1-3/+0
Without this for example 64-bit binaries on typical amd64 distributions would not be able to use ioctls on NFS. For now this only affects clones. Additionally ->compat_ioctl is defined even for non-compat builds, so get rid of the pointless ifdef. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-24nfs: pass on count for CLONE operationsChristoph Hellwig1-0/+1
Currently we pass uninitialized stack garbage in the count parameter. The value is usually large enought to clone whole files and thus let simple tests pass, but it makes the tests for range clones very unhappy. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>