summaryrefslogtreecommitdiff
path: root/fs/nfsd
AgeCommit message (Collapse)AuthorFilesLines
2019-02-20nfsd4: fix crash on writing v4_end_grace before nfsd startupJ. Bruce Fields1-0/+2
[ Upstream commit 62a063b8e7d1db684db3f207261a466fa3194e72 ] Anatoly Trosinenko reports that this: 1) Checkout fresh master Linux branch (tested with commit e195ca6cb) 2) Copy x84_64-config-4.14 to .config, then enable NFS server v4 and build 3) From `kvm-xfstests shell`: results in NULL dereference in locks_end_grace. Check that nfsd has been started before trying to end the grace period. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-10-13nfsd: fix corrupted reply to badly ordered compoundJ. Bruce Fields1-0/+1
[ Upstream commit 5b7b15aee641904ae269be9846610a3950cbd64c ] We're encoding a single op in the reply but leaving the number of ops zero, so the reply makes no sense. Somewhat academic as this isn't a case any real client will hit, though in theory perhaps that could change in a future protocol extension. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdirScott Mayhew1-2/+3
commit 9c2ece6ef67e9d376f32823086169b489c422ed0 upstream. nfsd4_readdir_rsize restricts rd_maxcount to svc_max_payload when estimating the size of the readdir reply, but nfsd_encode_readdir restricts it to INT_MAX when encoding the reply. This can result in log messages like "kernel: RPC request reserved 32896 but used 1049444". Restrict rd_dircount similarly (no reason it should be larger than svc_max_payload). Signed-off-by: Scott Mayhew <smayhew@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-07nfsd: check for use of the closed special stateidAndrew Elble1-2/+5
[ Upstream commit ae254dac721d44c0bfebe2795df87459e2e88219 ] Prevent the use of the closed (invalid) special stateid by clients. Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-07nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0)Trond Myklebust1-0/+8
[ Upstream commit fb500a7cfee7f2f447d2bbf30cb59629feab6ac1 ] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20NFSD: fix nfsd_reset_versions for NFSv4.NeilBrown1-15/+12
[ Upstream commit 800a938f0bf9130c8256116649c0cc5806bfb2fd ] If you write "-2 -3 -4" to the "versions" file, it will notice that no versions are enabled, and nfsd_reset_versions() is called. This enables all major versions, not no minor versions. So we lose the invariant that NFSv4 is only advertised when at least one minor is enabled. Fix the code to explicitly enable minor versions for v4, change it to use nfsd_vers() to test and set, and simplify the code. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)NeilBrown1-1/+2
[ Upstream commit 928c6fb3a9bfd6c5b287aa3465226add551c13c0 ] Current code will return 1 if the version is supported, and -1 if it isn't. This is confusing and inconsistent with the one place where this is used. So change to return 1 if it is supported, and zero if not. i.e. an error is never returned. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30nfsd: deal with revoked delegations appropriatelyAndrew Elble1-1/+24
commit 95da1b3a5aded124dd1bda1e3cdb876184813140 upstream. If a delegation has been revoked by the server, operations using that delegation should error out with NFS4ERR_DELEG_REVOKED in the >4.1 case, and NFS4ERR_BAD_STATEID otherwise. The server needs NFSv4.1 clients to explicitly free revoked delegations. If the server returns NFS4ERR_DELEG_REVOKED, the client will do that; otherwise it may just forget about the delegation and be unable to recover when it later sees SEQ4_STATUS_RECALLABLE_STATE_REVOKED set on a SEQUENCE reply. That can cause the Linux 4.1 client to loop in its stage manager. Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-30nfsd: Limit end of page list when decoding NFSv4 WRITEChuck Lever1-4/+2
commit fc788f64f1f3eb31e87d4f53bcf1ab76590d5838 upstream. When processing an NFSv4 WRITE operation, argp->end should never point past the end of the data in the final page of the page list. Otherwise, nfsd4_decode_compound can walk into uninitialized memory. More critical, nfsd4_decode_write is failing to increment argp->pagelen when it increments argp->pagelist. This can cause later xdr decoders to assume more data is available than really is, which can cause server crashes on malformed requests. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14nfsd4: fix null dereference on replayJ. Bruce Fields1-7/+6
commit 9a307403d374b993061f5992a6e260c944920d0b upstream. if we receive a compound such that: - the sessionid, slot, and sequence number in the SEQUENCE op match a cached succesful reply with N ops, and - the Nth operation of the compound is a PUTFH, PUTPUBFH, PUTROOTFH, or RESTOREFH, then nfsd4_sequence will return 0 and set cstate->status to nfserr_replay_cache. The current filehandle will not be set. This will cause us to call check_nfsd_access with first argument NULL. To nfsd4_compound it looks like we just succesfully executed an operation that set a filehandle, but the current filehandle is not set. Fix this by moving the nfserr_replay_cache earlier. There was never any reason to have it after the encode_op label, since the only case where he hit that is when opdesc->op_func sets it. Note that there are two ways we could hit this case: - a client is resending a previously sent compound that ended with one of the four PUTFH-like operations, or - a client is sending a *new* compound that (incorrectly) shares sessionid, slot, and sequence number with a previously sent compound, and the length of the previously sent compound happens to match the position of a PUTFH-like operation in the new compound. The second is obviously incorrect client behavior. The first is also very strange--the only purpose of a PUTFH-like operation is to set the current filehandle to be used by the following operation, so there's no point in having it as the last in a compound. So it's likely this requires a buggy or malicious client to reproduce. Reported-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-08nfsd: check for oversized NFSv2/v3 argumentsJ. Bruce Fields1-0/+36
commit e6838a29ecb484c97e4efef9429643b9851fba6e upstream. A client can append random data to the end of an NFSv2 or NFSv3 RPC call without our complaining; we'll just stop parsing at the end of the expected data and ignore the rest. Encoded arguments and replies are stored together in an array of pages, and if a call is too large it could leave inadequate space for the reply. This is normally OK because NFS RPC's typically have either short arguments and long replies (like READ) or long arguments and short replies (like WRITE). But a client that sends an incorrectly long reply can violate those assumptions. This was observed to cause crashes. Also, several operations increment rq_next_page in the decode routine before checking the argument size, which can leave rq_next_page pointing well past the end of the page array, causing trouble later in svc_free_pages. So, following a suggestion from Neil Brown, add a central check to enforce our expectation that no NFSv2/v3 call has both a large call and a large reply. As followup we may also want to rewrite the encoding routines to check more carefully that they aren't running off the end of the page array. We may also consider rejecting calls that have any extra garbage appended. That would be safer, and within our rights by spec, but given the age of our server and the NFS protocol, and the fact that we've never enforced this before, we may need to balance that against the possibility of breaking some oddball client. Reported-by: Tuomas Haanpää <thaan@synopsys.com> Reported-by: Ari Kauppi <ari@synopsys.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-30nfsd: work around a gcc-5.1 warningArnd Bergmann1-2/+1
commit 6ac75368e1a658903cf57b2bbf66e60d34f55558 upstream. gcc-5.0 warns about a potential uninitialized variable use in nfsd: fs/nfsd/nfs4state.c: In function 'nfsd4_process_open2': fs/nfsd/nfs4state.c:3781:3: warning: 'old_deny_bmap' may be used uninitialized in this function [-Wmaybe-uninitialized] reset_union_bmap_deny(old_deny_bmap, stp); ^ fs/nfsd/nfs4state.c:3760:16: note: 'old_deny_bmap' was declared here unsigned char old_deny_bmap; ^ This is a false positive, the code path that is warned about cannot actually be reached. This adds an initialization for the variable to make the warning go away. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-23nfsd: Disable NFSv2 timestamp workaround for NFSv3+Andreas Gruenbacher2-38/+50
[ Upstream commit cc265089ce1b176dde963c74b53593446ee7f99a ] NFSv2 can set the atime and/or mtime of a file to specific timestamps but not to the server's current time. To implement the equivalent of utimes("file", NULL), it uses a heuristic. NFSv3 and later do support setting the atime and/or mtime to the server's current time directly. The NFSv2 heuristic is still enabled, and causes timestamps to be set wrong sometimes. Fix this by moving the heuristic into the NFSv2 specific code. We can leave it out of the create code path: the owner can always set timestamps arbitrarily, and the workaround would never trigger. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
2016-08-22nfsd: check permissions when setting ACLsBen Hutchings3-27/+25
[ Upstream commit 999653786df6954a31044528ac3f7a5dadca08f4 ] Use set_posix_acl, which includes proper permission checks, instead of calling ->set_acl directly. Without this anyone may be able to grant themselves permissions to a file by setting the ACL. Lock the inode to make the new checks atomic with respect to set_acl. (Also, nfsd was the only caller of set_acl not locking the inode, so I suspect this may fix other races.) This also simplifies the code, and ensures our ACLs are checked by posix_acl_valid. The permission checks and the inode locking were lost with commit 4ac7249e, which changed nfsd to use the set_acl inode operation directly instead of going through xattr handlers. Reported-by: David Sinquin <david@sinquin.eu> [agreunba@redhat.com: use set_posix_acl] Fixes: 4ac7249e Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
2016-07-12nfsd4/rpc: move backchannel create logic into rpc codeJ. Bruce Fields1-17/+1
[ Upstream commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 ] Also simplify the logic a bit. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Trond Myklebust <trondmy@primarydata.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-04-18nfsd: fix deadlock secinfo+readdir compoundJ. Bruce Fields1-0/+1
[ Upstream commit 2f6fc056e899bd0144a08da5cacaecbe8997cd74 ] nfsd_lookup_dentry exits with the parent filehandle locked. fh_put also unlocks if necessary (nfsd filehandle locking is probably too lenient), so it gets unlocked eventually, but if the following op in the compound needs to lock it again, we can deadlock. A fuzzer ran into this; normal clients don't send a secinfo followed by a readdir in the same compound. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-04-18nfsd4: fix bad bounds checkingJ. Bruce Fields1-5/+8
[ Upstream commit 4aed9c46afb80164401143aa0fdcfe3798baa9d5 ] A number of spots in the xdr decoding follow a pattern like n = be32_to_cpup(p++); READ_BUF(n + 4); where n is a u32. The only bounds checking is done in READ_BUF itself, but since it's checking (n + 4), it won't catch cases where n is very large, (u32)(-4) or higher. I'm not sure exactly what the consequences are, but we've seen crashes soon after. Instead, just break these up into two READ_BUF()s. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-01-21nfsd: serialize state seqid morphing operationsJeff Layton2-13/+37
[ Upstream commit 35a92fe8770ce54c5eb275cd76128645bea2d200 ] Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were running in parallel. The server would receive the OPEN_DOWNGRADE first and check its seqid, but then an OPEN would race in and bump it. The OPEN_DOWNGRADE would then complete and bump the seqid again. The result was that the OPEN_DOWNGRADE would be applied after the OPEN, even though it should have been rejected since the seqid changed. The only recourse we have here I think is to serialize operations that bump the seqid in a stateid, particularly when we're given a seqid in the call. To address this, we add a new rw_semaphore to the nfs4_ol_stateid struct. We do a down_write prior to checking the seqid after looking up the stateid to ensure that nothing else is going to bump it while we're operating on it. In the case of OPEN, we do a down_read, as the call doesn't contain a seqid. Those can run in parallel -- we just need to serialize them when there is a concurrent OPEN_DOWNGRADE or CLOSE. LOCK and LOCKU however always take the write lock as there is no opportunity for parallelizing those. Reported-and-Tested-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-08-27nfsd: do nfs4_check_fh in nfs4_check_file instead of nfs4_check_olstateidJeff Layton1-6/+6
[ Upstream commit 1ccdd6c6e9a342c2ed4ced38faa67303226a2a6a ] commit 8fcd461db7c09337b6d2e22d25eb411123f379e3 upstream. Currently, preprocess_stateid_op calls nfs4_check_olstateid which verifies that the open stateid corresponds to the current filehandle in the call by calling nfs4_check_fh. If the stateid is a NFS4_DELEG_STID however, then no such check is done. This could cause incorrect enforcement of permissions, because the nfsd_permission() call in nfs4_check_file uses current the current filehandle, but any subsequent IO operation will use the file descriptor in the stateid. Move the call to nfs4_check_fh into nfs4_check_file instead so that it can be done for all stateid types. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> [bfields: moved fh check to avoid NULL deref in special stateid case] Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-08-27nfsd: refactor nfs4_preprocess_stateid_opChristoph Hellwig1-45/+53
[ Upstream commit 3b5c2aed0e5557c6bc4a305e7627a16a764b4cdb ] commit a0649b2d3fffb1cde8745568c767f3a55a3462bc upstream. Split out two self contained helpers to make the function more readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Cc: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-08-27nfsd: Drop BUG_ON and ignore SECLABEL on absent filesystemKinglong Mee1-5/+6
[ Upstream commit c7e6f05156402364f34669e0fa6fd69b834f994b ] commit c2227a39a078473115910512aa0f8d53bd915e60 upstream. On an absent filesystem (one served by another server), we need to be able to handle requests for certain attributest (like fs_locations, so the client can find out which server does have the filesystem), but others we can't. We forgot to take that into account when adding another attribute bitmask work for the SECURITY_LABEL attribute. There an export entry with the "refer" option can result in: [ 88.414272] kernel BUG at fs/nfsd/nfs4xdr.c:2249! [ 88.414828] invalid opcode: 0000 [#1] SMP [ 88.415368] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iosf_mbi ppdev btrfs coretemp crct10dif_pclmul crc32_pclmul crc32c_intel xor ghash_clmulni_intel raid6_pq vmw_balloon parport_pc parport i2c_piix4 shpchp vmw_vmci acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi mptscsih serio_raw mptbase e1000 scsi_transport_spi ata_generic pata_acpi [last unloaded: nfsd] [ 88.417827] CPU: 0 PID: 2116 Comm: nfsd Not tainted 4.0.7-300.fc22.x86_64 #1 [ 88.418448] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [ 88.419093] task: ffff880079146d50 ti: ffff8800785d8000 task.ti: ffff8800785d8000 [ 88.419729] RIP: 0010:[<ffffffffa04b3c10>] [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.420376] RSP: 0000:ffff8800785db998 EFLAGS: 00010206 [ 88.421027] RAX: 0000000000000001 RBX: 000000000018091a RCX: ffff88006668b980 [ 88.421676] RDX: 00000000fffef7fc RSI: 0000000000000000 RDI: ffff880078d05000 [ 88.422315] RBP: ffff8800785dbb58 R08: ffff880078d043f8 R09: ffff880078d4a000 [ 88.422968] R10: 0000000000010000 R11: 0000000000000002 R12: 0000000000b0a23a [ 88.423612] R13: ffff880078d05000 R14: ffff880078683100 R15: ffff88006668b980 [ 88.424295] FS: 0000000000000000(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000 [ 88.424944] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.425597] CR2: 00007f40bc370f90 CR3: 0000000035af5000 CR4: 00000000001407f0 [ 88.426285] Stack: [ 88.426921] ffff8800785dbaa8 ffffffffa049e4af ffff8800785dba08 ffffffff813298f0 [ 88.427585] ffff880078683300 ffff8800769b0de8 0000089d00000001 0000000087f805e0 [ 88.428228] ffff880000000000 ffff880079434a00 0000000000000000 ffff88006668b980 [ 88.428877] Call Trace: [ 88.429527] [<ffffffffa049e4af>] ? exp_get_by_name+0x7f/0xb0 [nfsd] [ 88.430168] [<ffffffff813298f0>] ? inode_doinit_with_dentry+0x210/0x6a0 [ 88.430807] [<ffffffff8123833e>] ? d_lookup+0x2e/0x60 [ 88.431449] [<ffffffff81236133>] ? dput+0x33/0x230 [ 88.432097] [<ffffffff8123f214>] ? mntput+0x24/0x40 [ 88.432719] [<ffffffff812272b2>] ? path_put+0x22/0x30 [ 88.433340] [<ffffffffa049ac87>] ? nfsd_cross_mnt+0xb7/0x1c0 [nfsd] [ 88.433954] [<ffffffffa04b54e0>] nfsd4_encode_dirent+0x1b0/0x3d0 [nfsd] [ 88.434601] [<ffffffffa04b5330>] ? nfsd4_encode_getattr+0x40/0x40 [nfsd] [ 88.435172] [<ffffffffa049c991>] nfsd_readdir+0x1c1/0x2a0 [nfsd] [ 88.435710] [<ffffffffa049a530>] ? nfsd_direct_splice_actor+0x20/0x20 [nfsd] [ 88.436447] [<ffffffffa04abf30>] nfsd4_encode_readdir+0x120/0x220 [nfsd] [ 88.437011] [<ffffffffa04b58cd>] nfsd4_encode_operation+0x7d/0x190 [nfsd] [ 88.437566] [<ffffffffa04aa6dd>] nfsd4_proc_compound+0x24d/0x6f0 [nfsd] [ 88.438157] [<ffffffffa0496103>] nfsd_dispatch+0xc3/0x220 [nfsd] [ 88.438680] [<ffffffffa006f0cb>] svc_process_common+0x43b/0x690 [sunrpc] [ 88.439192] [<ffffffffa0070493>] svc_process+0x103/0x1b0 [sunrpc] [ 88.439694] [<ffffffffa0495a57>] nfsd+0x117/0x190 [nfsd] [ 88.440194] [<ffffffffa0495940>] ? nfsd_destroy+0x90/0x90 [nfsd] [ 88.440697] [<ffffffff810bb728>] kthread+0xd8/0xf0 [ 88.441260] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.441762] [<ffffffff81789e58>] ret_from_fork+0x58/0x90 [ 88.442322] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.442879] Code: 0f 84 93 05 00 00 83 f8 ea c7 85 a0 fe ff ff 00 00 27 30 0f 84 ba fe ff ff 85 c0 0f 85 a5 fe ff ff e9 e3 f9 ff ff 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 be 04 00 00 00 4c 89 ef 4c 89 8d 68 fe [ 88.444052] RIP [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.444658] RSP <ffff8800785db998> [ 88.445232] ---[ end trace 6cb9d0487d94a29f ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-06-10nfsd: fix the check for confirmed openowner in nfs4_preprocess_stateid_opChristoph Hellwig1-10/+11
[ Upstream commit ebe9cb3bb13e7b9b281969cd279ce70834f7500f ] If we find a non-confirmed openowner we jump to exit the function, but do not set an error value. Fix this by factoring out a helper to do the check and properly set the error from nfsd4_validate_stateid. Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-05-18nfsd4: disallow SEEK with special stateidsJ. Bruce Fields1-0/+2
[ Upstream commit 980608fb50aea34993ba956b71cd4602aa42b14b ] If the client uses a special stateid then we'll pass a NULL file to vfs_llseek. Fixes: 24bab491220f " NFSD: Implement SEEK" Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: stable@vger.kernel.org Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-05-18nfsd4: fix READ permission checkingJ. Bruce Fields1-4/+8
[ Upstream commit 6e4891dc289cd191d46ab7ba1dcb29646644f9ca ] In the case we already have a struct file (derived from a stateid), we still need to do permission-checking; otherwise an unauthorized user could gain access to a file by sniffing or guessing somebody else's stateid. Cc: stable@vger.kernel.org Fixes: dc97618ddda9 "nfsd4: separate splice and readv cases" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-04-23nfsd: return correct lockowner when there is a race on hash insertJ. Bruce Fields1-1/+1
[ Upstream commit 340f0ba1c6c8412aa35fd6476044836b84361ea6 ] alloc_init_lock_stateowner can return an already freed entry if there is a race to put openowners in the hashtable. Noticed by inspection after Jeff Layton fixed the same bug for open owners. Depending on client behavior, this one may be trickier to trigger in practice. Fixes: c58c6610ec24 "nfsd: Protect adding/removing lock owners using client_lock" Cc: <stable@vger.kernel.org> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Acked-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-04-23nfsd: return correct openowner when there is a race to put one in the hashJeff Layton1-1/+1
[ Upstream commit c5952338bfc234e54deda45b7228f610a545e28a ] alloc_init_open_stateowner can return an already freed entry if there is a race to put openowners in the hashtable. In commit 7ffb588086e9, we changed it so that we allocate and initialize an openowner, and then check to see if a matching one got stuffed into the hashtable in the meantime. If it did, then we free the one we just allocated and take a reference on the one already there. There is a bug here though. The code will then return the pointer to the one that was allocated (and has now been freed). This wasn't evident before as this race almost never occurred. The Linux kernel client used to serialize requests for a single openowner. That has changed now with v4.0 kernels, and this race can now easily occur. Fixes: 7ffb588086e9 Cc: <stable@vger.kernel.org> # v3.17+ Cc: Trond Myklebust <trond.myklebust@primarydata.com> Reported-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-03-24nfsd: fix clp->cl_revoked list deletion causing softlock in nfsdAndrew Elble1-1/+1
commit c876486be17aeefe0da569f3d111cbd8de6f675d upstream. commit 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock") removed the use of the reaplist to clean out clp->cl_revoked. It failed to change list_entry() to walk clp->cl_revoked.next instead of reaplist.next Fixes: 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock") Reported-by: Eric Meddaugh <etmsys@rit.edu> Tested-by: Eric Meddaugh <etmsys@rit.edu> Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16nfsd: fix fi_delegees leak when fi_had_conflict returns trueJeff Layton1-1/+1
commit 94ae1db226a5bcbb48372d81161f084c9e283fd8 upstream. Currently, nfs4_set_delegation takes a reference to an existing delegation and then checks to see if there is a conflict. If there is one, then it doesn't release that reference. Change the code to take the reference after the check and only if there is no conflict. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16nfsd4: fix xdr4 count of server in fs_location4Benjamin Coddington1-1/+1
commit bf7491f1be5e125eece2ec67e0f79d513caa6c7e upstream. Fix a bug where nfsd4_encode_components_esc() incorrectly calculates the length of server array in fs_location4--note that it is a count of the number of array elements, not a length in bytes. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 082d4bd72a45 (nfsd4: "backfill" using write_bytes_to_xdr_buf) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16nfsd4: fix xdr4 inclusion of escaped charBenjamin Coddington1-0/+3
commit 5a64e56976f1ba98743e1678c0029a98e9034c81 upstream. Fix a bug where nfsd4_encode_components_esc() includes the esc_end char as an additional string encoding. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: e7a0444aef4a "nfsd: add IPv6 addr escaping to fs_location hosts" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16fs: nfsd: Fix signedness bug in compare_blobRasmus Villemoes1-8/+7
commit ef17af2a817db97d42dd2ec0a425231748e23dbc upstream. Bugs similar to the one in acbbe6fbb240 (kcmp: fix standard comparison bug) are in rich supply. In this variant, the problem is that struct xdr_netobj::len has type unsigned int, so the expression o1->len - o2->len _also_ has type unsigned int; it has completely well-defined semantics, and the result is some non-negative integer, which is always representable in a long long. But this means that if the conditional triggers, we are guaranteed to return a positive value from compare_blob. In this case it could be fixed by - res = o1->len - o2->len; + res = (long long)o1->len - (long long)o2->len; but I'd rather eliminate the usually broken 'return a - b;' idiom. Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-19nfsd: Fix slot wake up race in the nfsv4.1 callback codeTrond Myklebust1-2/+6
The currect code for nfsd41_cb_get_slot() and nfsd4_cb_done() has no locking in order to guarantee atomicity, and so allows for races of the form. Task 1 Task 2 ====== ====== if (test_and_set_bit(0) != 0) { clear_bit(0) rpc_wake_up_next(queue) rpc_sleep_on(queue) return false; } This patch breaks the race condition by adding a retest of the bit after the call to rpc_sleep_on(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-11-19nfsd: correctly define v4.2 support attributesChristoph Hellwig1-3/+6
Even when security labels are disabled we support at least the same attributes as v4.1. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-10-23nfsd4: fix crash on unknown operation numberJ. Bruce Fields1-1/+2
Unknown operation numbers are caught in nfsd4_decode_compound() which sets op->opnum to OP_ILLEGAL and op->status to nfserr_op_illegal. The error causes the main loop in nfsd4_proc_compound() to skip most processing. But nfsd4_proc_compound also peeks ahead at the next operation in one case and doesn't take similar precautions there. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-10-21nfsd4: fix response size estimation for OP_SEQUENCEJ. Bruce Fields1-1/+3
We added this new estimator function but forgot to hook it up. The effect is that NFSv4.1 (and greater) won't do zero-copy reads. The estimate was also wrong by 8 bytes. Fixes: ccae70a9ee41 "nfsd4: estimate sequence response size" Cc: stable@vger.kernel.org Reported-by: Chuck Lever <chucklever@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-10-13Merge branch 'sched-core-for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave Hansen) - Various sched/idle refinements for better idle handling (Nicolas Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot) - sched/numa updates and optimizations (Rik van Riel) - sysbench speedup (Vincent Guittot) - capacity calculation cleanups/refactoring (Vincent Guittot) - Various cleanups to thread group iteration (Oleg Nesterov) - Double-rq-lock removal optimization and various refactorings (Kirill Tkhai) - various sched/deadline fixes ... and lots of other changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/dl: Use dl_bw_of() under rcu_read_lock_sched() sched/fair: Delete resched_cpu() from idle_balance() sched, time: Fix build error with 64 bit cputime_t on 32 bit systems sched: Improve sysbench performance by fixing spurious active migration sched/x86: Fix up typo in topology detection x86, sched: Add new topology for multi-NUMA-node CPUs sched/rt: Use resched_curr() in task_tick_rt() sched: Use rq->rd in sched_setaffinity() under RCU read lock sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask' sched: Use dl_bw_of() under RCU read lock sched/fair: Remove duplicate code from can_migrate_task() sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW sched: print_rq(): Don't use tasklist_lock sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock() sched: Fix the task-group check in tg_has_rt_tasks() sched/fair: Leverage the idle state info when choosing the "idlest" cpu sched: Let the scheduler see CPU idle states sched/deadline: Fix inter- exclusive cpusets migrations sched/deadline: Clear dl_entity params when setscheduling to different class sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault() ...
2014-10-12Merge branch 'next' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris. Mostly ima, selinux, smack and key handling updates. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits) integrity: do zero padding of the key id KEYS: output last portion of fingerprint in /proc/keys KEYS: strip 'id:' from ca_keyid KEYS: use swapped SKID for performing partial matching KEYS: Restore partial ID matching functionality for asymmetric keys X.509: If available, use the raw subjKeyId to form the key description KEYS: handle error code encoded in pointer selinux: normalize audit log formatting selinux: cleanup error reporting in selinux_nlmsg_perm() KEYS: Check hex2bin()'s return when generating an asymmetric key ID ima: detect violations for mmaped files ima: fix race condition on ima_rdwr_violation_check and process_measurement ima: added ima_policy_flag variable ima: return an error code from ima_add_boot_aggregate() ima: provide 'ima_appraise=log' kernel option ima: move keyring initialization to ima_init() PKCS#7: Handle PKCS#7 messages that contain no X.509 certs PKCS#7: Better handling of unsupported crypto KEYS: Overhaul key identification when searching for asymmetric keys KEYS: Implement binary asymmetric key ID handling ...
2014-10-11Merge tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linuxLinus Torvalds2-46/+59
Pull file locking related changes from Jeff Layton: "This release is a little more busy for file locking changes than the last: - a set of patches from Kinglong Mee to fix the lockowner handling in knfsd - a pile of cleanups to the internal file lease API. This should get us a bit closer to allowing for setlease methods that can block. There are some dependencies between mine and Bruce's trees this cycle, and I based my tree on top of the requisite patches in Bruce's tree" * tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux: (26 commits) locks: fix fcntl_setlease/getlease return when !CONFIG_FILE_LOCKING locks: flock_make_lock should return a struct file_lock (or PTR_ERR) locks: set fl_owner for leases to filp instead of current->files locks: give lm_break a return value locks: __break_lease cleanup in preparation of allowing direct removal of leases locks: remove i_have_this_lease check from __break_lease locks: move freeing of leases outside of i_lock locks: move i_lock acquisition into generic_*_lease handlers locks: define a lm_setup handler for leases locks: plumb a "priv" pointer into the setlease routines nfsd: don't keep a pointer to the lease in nfs4_file locks: clean up vfs_setlease kerneldoc comments locks: generic_delete_lease doesn't need a file_lock at all nfsd: fix potential lease memory leak in nfs4_setlease locks: close potential race in lease_get_mtime security: make security_file_set_fowner, f_setown and __f_setown void return locks: consolidate "nolease" routines locks: remove lock_may_read and lock_may_write lockd: rip out deferred lock handling from testlock codepath NFSD: Get reference of lockowner when coping file_lock ...
2014-10-08Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linuxLinus Torvalds17-314/+662
Pull nfsd updates from Bruce Fields: "Highlights: - support the NFSv4.2 SEEK operation (allowing clients to support SEEK_HOLE/SEEK_DATA), thanks to Anna. - end the grace period early in a number of cases, mitigating a long-standing annoyance, thanks to Jeff - improve SMP scalability, thanks to Trond" * 'for-3.18' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: eliminate "to_delegation" define NFSD: Implement SEEK NFSD: Add generic v4.2 infrastructure svcrdma: advertise the correct max payload nfsd: introduce nfsd4_callback_ops nfsd: split nfsd4_callback initialization and use nfsd: introduce a generic nfsd4_cb nfsd: remove nfsd4_callback.cb_op nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence nfsd: fix nfsd4_cb_recall_done error handling nfsd4: clarify how grace period ends nfsd4: stop grace_time update at end of grace period nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls nfsd: serialize nfsdcltrack upcalls for a particular client nfsd: pass extra info in env vars to upcalls to allow for early grace period end nfsd: add a v4_end_grace file to /proc/fs/nfsd lockd: add a /proc/fs/lockd/nlm_end_grace file nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE nfsd: remove redundant boot_time parm from grace_done client tracking op ...
2014-10-07locks: give lm_break a return valueJeff Layton1-8/+9
Christoph suggests: "Add a return value to lm_break so that the lock manager can tell the core code "you can delete this lease right now". That gets rid of the games with the timeout which require all kinds of race avoidance code in the users." Do that here and have the nfsd lease break routine use it when it detects that there was a race between setting up the lease and it being broken. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: move freeing of leases outside of i_lockJeff Layton1-3/+3
There was only one place where we still could free a file_lock while holding the i_lock -- lease_modify. Add a new list_head argument to the lm_change operation, pass in a private list when calling it, and fix those callers to dispose of the list once the lock has been dropped. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: define a lm_setup handler for leasesJeff Layton1-4/+2
...and move the fasync setup into it for fcntl lease calls. At the same time, change the semantics of how the file_lock double-pointer is handled. Up until now, on a successful lease return you got a pointer to the lock on the list. This is bad, since that pointer can no longer be relied on as valid once the inode->i_lock has been released. Change the code to instead just zero out the pointer if the lease we passed in ended up being used. Then the callers can just check to see if it's NULL after the call and free it if it isn't. The priv argument has the same semantics. The lm_setup function can zero the pointer out to signal to the caller that it should not be freed after the function returns. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: plumb a "priv" pointer into the setlease routinesJeff Layton1-2/+2
In later patches, we're going to add a new lock_manager_operation to finish setting up the lease while still holding the i_lock. To do this, we'll need to pass a little bit of info in the fcntl setlease case (primarily an fasync structure). Plumb the extra pointer into there in advance of that. We declare this pointer as a void ** to make it clear that this is private info, and that the caller isn't required to set this unless the lm_setup specifically requires it. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07nfsd: don't keep a pointer to the lease in nfs4_fileJeff Layton2-10/+4
Now that we don't need to pass in an actual lease pointer to vfs_setlease on unlock, we can stop tracking a pointer to the lease in the nfs4_file. Switch all of the places that check the fi_lease to check fi_deleg_file instead. We always set that at the same time so it will have the same semantics. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: generic_delete_lease doesn't need a file_lock at allJeff Layton1-1/+1
Ensure that it's OK to pass in a NULL file_lock double pointer on a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to do just that. Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE with an error return. That's a problem we can handle without crashing the box if it occurs. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07nfsd: fix potential lease memory leak in nfs4_setleaseJeff Layton1-2/+5
It's unlikely to ever occur, but if there were already a lease set on the file then we could end up getting back a different pointer on a successful setlease attempt than the one we allocated. If that happens, the one we allocated could leak. In practice, I don't think this will happen due to the fact that we only try to set up the lease once per nfs4_file, but this error handling is a bit more correct given the current lease API. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-01nfsd: eliminate "to_delegation" defineJeff Layton3-7/+4
We now have cb_to_delegation and to_delegation, which do the same thing and are defined separately in different .c files. Move the cb_to_delegation definition into a header file and eliminate the redundant to_delegation definition. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2014-09-30nfsd4: fix corruption of NFSv4 read dataJ. Bruce Fields1-1/+2
The calculation of page_ptr here is wrong in the case the read doesn't start at an offset that is a multiple of a page. The result is that nfs4svc_encode_compoundres sets rq_next_page to a value one too small, and then the loop in svc_free_res_pages may incorrectly fail to clear a page pointer in rq_respages[]. Pages left in rq_respages[] are available for the next rpc request to use, so xdr data may be written to that page, which may hold data still waiting to be transmitted to the client or data in the page cache. The observed result was silent data corruption seen on an NFSv4 client. We tag this as "fixing" 05638dc73af2 because that commit exposed this bug, though the incorrect calculation predates it. Particular thanks to Andrea Arcangeli and David Gilbert for analysis and testing. Fixes: 05638dc73af2 "nfsd4: simplify server xdr->next_page use" Cc: stable@vger.kernel.org Reported-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-09-30Merge commit 'v3.16' into nextJames Morris1-1/+3
2014-09-29NFSD: Implement SEEKAnna Schumaker3-2/+97
This patch adds server support for the NFS v4.2 operation SEEK, which returns the position of the next hole or data segment in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>