summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2008-10-03inotify: fix lock ordering wrt do_page_fault's mmap_semNick Piggin1-7/+20
Fix inotify lock order reversal with mmap_sem due to holding locks over copy_to_user. Signed-off-by: Nick Piggin <npiggin@suse.de> Reported-by: "Daniel J Blueman" <daniel.blueman@gmail.com> Tested-by: "Daniel J Blueman" <daniel.blueman@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-29mm owner: fix race between swapoff and exitBalbir Singh1-1/+1
There's a race between mm->owner assignment and swapoff, more easily seen when task slab poisoning is turned on. The condition occurs when try_to_unuse() runs in parallel with an exiting task. A similar race can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats> or ptrace or page migration. CPU0 CPU1 try_to_unuse looks at mm = task0->mm increments mm->mm_users task 0 exits mm->owner needs to be updated, but no new owner is found (mm_users > 1, but no other task has task->mm = task0->mm) mm_update_next_owner() leaves mmput(mm) decrements mm->mm_users task0 freed dereferencing mm->owner fails The fix is to notify the subsystem via mm_owner_changed callback(), if no new owner is found, by specifying the new task as NULL. Jiri Slaby: mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but must be set after that, so as not to pass NULL as old owner causing oops. Daisuke Nishimura: mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task() and its callers need to take account of this situation to avoid oops. Hugh Dickins: Lockdep warning and hang below exec_mmap() when testing these patches. exit_mm() up_reads mmap_sem before calling mm_update_next_owner(), so exec_mmap() now needs to do the same. And with that repositioning, there's now no point in mm_need_new_owner() allowing for NULL mm. Reported-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-29Fix NULL pointer dereference in proc_sys_compareLinus Torvalds1-4/+6
The VFS interface for the 'd_compare()' is a bit special (read: 'odd'), because it really just essentially replaces a memcmp(). The filesystem is supposed to just compare the two names with whatever case-independent or other function. And when I say 'is supposed to', I obviously mean that 'procfs does odd things, and actually looks at the dentry that we don't even pass down, rather than just the name'. Which results in problems, because we actually call d_compare before we have even verified that the dentry is still hashed at all. And that causes a problm since the inode that procfs looks at may have been free'd and the d_inode pointer is NULL. procfs just assumes that all dentries are positive, since procfs itself never generates a negative one. But memory pressure will still result in the dentry getting torn down, and as it is removed by RCU, it still remains visible on some lists - and to d_compare. If the filesystem just did a name comparison, we wouldn't care. And we could just fix procfs to know about negative dentries too. But rather than have the low-level filesystems know about internal VFS details, just move the check for a unhashed dentry up a bit, so that we will only call d_compare on dentries that are still active. The actual oops this caused didn't look like a NULL pointer dereference because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)' to get at its internal proc_inode information from the inode pointer, and accessed a field below the inode. So the oops would look something like BUG: unable to handle kernel paging request at fffffffffffffff0 IP: [<ffffffff802bc6c6>] proc_sys_compare+0x36/0x50 and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and ppc64 (Hugh Dickins). Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Hugh Dickins <hugh@veritas.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-26Merge git://oss.sgi.com:8090/xfs/linux-2.6Linus Torvalds1-91/+3
* git://oss.sgi.com:8090/xfs/linux-2.6: [XFS] Remove xfs_iext_irec_compact_full() [XFS] Fix extent list corruption in xfs_iext_irec_compact_full().
2008-09-26Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6Linus Torvalds6-9/+15
* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6: UBIFS: fix printk format warnings UBIFS: remove incorrect assert UBIFS: TNC / GC race fixes UBIFS: create the name of the background thread in every case
2008-09-26GFS2: Support for I/O barriersSteven Whitehouse2-3/+19
This patch adds barrier support to GFS2. There is not a lot of change really... we just add the barrier flag when we write journal header blocks. If the underlying device refuses to support them, we fall back to the previous way of doing things (wait for the I/O and hope) since there is nothing else we can do. There is no user configuration, barriers will always be on unless the device refuses to support them. This seems a reasonable solution to me since this is a correctness issue. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-26[XFS] Remove xfs_iext_irec_compact_full()Lachlan McIlroy1-92/+3
Yet another bug was found in xfs_iext_irec_compact_full() and while the source of the bug was found it wasn't an easy task to track it down because the conditions are very difficult to reproduce. A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their significant effort in tracking down the source of this corruption. xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost identical - they both compact indirect extent lists by moving extents from subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only moves extents if all of the extents in the next buffer will fit into the empty space in the buffer before it. xfs_iext_irec_compact_full() will go a step further and move part of the next buffer if all the extents wont fit. It will then shift the remaining extents in the next buffer up to the start of the buffer. The bug here was that we did not update er_extoff and this caused extent list corruption. It does not appear that this extra functionality gains us much. Calling xfs_iext_irec_compact_pages() instead will do a good enough job at compacting the indirect list and will be quicker too. For the case in xfs_iext_indirect_to_direct() the total number of extents in the indirect list will fit into one buffer so we will never need the extra functionality of xfs_iext_irec_compact_full() there. Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the buffers will never overlap) so we don't want the performance hit that can incur. SGI-PV: 987159 SGI-Modid: xfs-linux-melb:xfs-kern:32166a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
2008-09-26[XFS] Fix extent list corruption in xfs_iext_irec_compact_full().Lachlan McIlroy1-0/+1
If we don't move all the records from the next buffer into the current buffer then we need to update the er_extoff field of the next buffer as we shift the remaining records to the start of the buffer. SGI-PV: 987159 SGI-Modid: xfs-linux-melb:xfs-kern:32165a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Russell Cattelan <cattelan@thebarn.com>
2008-09-259p: use an IS_ERR test rather than a NULL testJulien Brunel1-2/+1
In case of error, the function p9_client_walk returns an ERR pointer, but never returns a NULL pointer. So a NULL test that comes after an IS_ERR test should be deleted. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match_bad_null_test@ expression x, E; statement S1,S2; @@ x = p9_client_walk(...) ... when != x = E * if (x != NULL) S1 else S2 // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-09-25cifs: explicitly revoke SPNEGO key after session setupJeff Layton1-1/+3
cifs: explicitly revoke SPNEGO key after session setup The SPNEGO blob returned by an upcall can only be used once. Explicitly revoke it to make sure that we never pick it up again after session setup exits. This doesn't seem to be that big an issue on more recent kernels, but older kernels seem to link keys into the session keyring by default. That said, explicitly revoking the key seems like a reasonable thing to do here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24cifs: Convert cifs to new aops.Nick Piggin1-61/+59
cifs: Convert cifs to new aops. This patch is based on the one originally posted by Nick Piggin. His patch was very close, but had a couple of small bugs. Nick's original comments follow: This is another relatively naive conversion. Always do the read upfront when the page is not uptodate (unless we're in the writethrough path). Fix an uninitialized data exposure where SetPageUptodate was called before the page was uptodate. SetPageUptodate and switch to writeback mode in the case that the full page was dirtied. Acked-by: Shaggy <shaggy@austin.ibm.com> Acked-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24[CIFS] update DOS attributes in cifsInode if we successfully changed themSteve French1-0/+4
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24cifs: remove NULL termination from rename target in CIFSSMBRenameOpenFIleJeff Layton2-3/+3
cifs: remove NULL termination from rename target in CIFSSMBRenameOpenFIle The rename destination isn't supposed to be null terminated. Also, change the name string arg to be const. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24cifs: work around samba returning -ENOENT on SetFileDisposition callJeff Layton1-2/+10
cifs: work around samba returning -ENOENT on SetFileDisposition call Samba seems to return STATUS_OBJECT_NAME_NOT_FOUND when we try to set the delete on close bit after doing a rename by filehandle. This looks like a samba bug to me, but a lot of servers will do this. For now, pretend an -ENOENT return is a success. Samba does however seem to respect the CREATE_DELETE_ON_CLOSE bit when opening files that already exist. Windows will ignore it, but so adding it to the open flags should be harmless. We're also currently ignoring the return code on the rename by filehandle, so no need to set rc based on it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24cifs: fix inverted NULL check after kmallocJeff Layton1-1/+1
cifs: fix inverted NULL check after kmalloc Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23[CIFS] clean up upcall handling for dns_resolver keysSteve French1-33/+41
We're given the datalen in the downcall, so there's no need to do any calls to strlen(). Just keep track of the datalen in the key. Finally, add a sanity check of the data in the downcall to make sure that it looks like a real IP address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23[CIFS] fix busy-file renames and refactor cifs_rename logicSteve French1-82/+104
Break out the code that does the actual renaming into a separate function and have cifs_rename call that. That function will attempt a path based rename first and then do a filehandle based one if it looks like the source is busy. The existing logic tried a path based rename first, but if we needed to remove the destination then it only attempted a filehandle based rename afterward. Not all servers support renaming by filehandle, so we need to always attempt path rename first and fall back to filehandle rename if it doesn't work. This also fixes renames of open files on windows servers (at least when the source and destination directories are the same). CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23cifs: add function to set file dispositionJeff Layton3-2/+64
cifs: add function to set file disposition The proper way to set the delete on close bit on an already existing file is to use SET_FILE_INFO with an infolevel of SMB_FILE_DISPOSITION_INFO. Add a function to do that and have the silly-rename code use it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23[CIFS] add constants for string lengths of keynames in SPNEGO upcall stringSteve French1-9/+26
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23cifs: move rename and delete-on-close logic into helper functionJeff Layton1-39/+59
cifs: move rename and delete-on-close logic into helper function When a file is still open on the server, we attempt to set the DELETE_ON_CLOSE bit and rename it to a new filename. When the last opener closes the file, the server should delete it. This patch moves this mechanism into a helper function and has the two places in cifs_unlink that do this procedure call it. It also fixes the open flags to be correct. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23cifs: have find_writeable_file prefer filehandles opened by same taskJeff Layton1-1/+9
When the CIFS client goes to write out pages, it needs to pick a filehandle to write to. find_writeable_file however just picks the first filehandle that it finds. This can cause problems when a lock is issued against a particular filehandle and we pick a different filehandle to write to. This patch tries to avert this situation by having find_writable_file prefer filehandles that have a pid that matches the current task. This seems to fix lock test 11 from the connectathon test suite when run against a windows server. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23cifs: don't use GFP_KERNEL with GFP_NOFSPekka Enberg2-6/+3
GFP_KERNEL and GFP_NOFS are mutually exclusive. If you combine them, you end up with plain GFP_KERNEL which can deadlock in cases where you really want GFP_NOFS. Cc: Steve French <sfrench@samba.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-18GFS2: high time to take some time over atimeSteven Whitehouse10-167/+65
Until now, we've used the same scheme as GFS1 for atime. This has failed since atime is a per vfsmnt flag, not a per fs flag and as such the "noatime" flag was not getting passed down to the filesystems. This patch removes all the "special casing" around atime updates and we simply use the VFS's atime code. The net result is that GFS2 will now support all the same atime related mount options of any other filesystem on a per-vfsmnt basis. We do lose the "lazy atime" updates, but we gain "relatime". We could add lazy atime to the VFS at a later date, if there is a requirement for that variant still - I suspect relatime will be enough. Also we lose about 100 lines of code after this patch has been applied, and I have a suspicion that it will speed things up a bit, even when atime is "on". So it seems like a nice clean up as well. From a user perspective, everything stays the same except the loss of the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very least, and to be honest I don't think anybody ever used it) and that a number of options which were ignored before now work correctly. Please let me know if you've got any comments. I'm pushing this out early so that you can all see what my plans are. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-18GFS2: The war on bloatSteven Whitehouse1-15/+15
The following patch shrinks the gfs2_args structure which is embedded in every GFS2 superblock. It cuts down the size of the options to a single unsigned int (the 13 bits of bitfields will be rounded up to that size by the compiler) from the current 11 unsigned ints. So on x86 thats 44 bytes shrinking to 4 bytes, in each and every GFS2 superblock. Signed-off-by: Steven Whitehouse <swhitho@redhat.com>
2008-09-18UBIFS: fix printk format warningsAlexander Beregalov2-2/+2
fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long unsigned int', but argument 5 has type 'long unsigned int' fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'long unsigned int' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-17UBIFS: remove incorrect assertAdrian Hunter1-1/+0
The assert was not valid because one of the variables 'taken_empty_lebs' has transient values out of sync with the other variables. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-17UBIFS: TNC / GC race fixesAdrian Hunter2-4/+12
- update GC sequence number if any nodes may have been moved even if GC did not finish the LEB - don't ignore error return when reading Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-17UBIFS: create the name of the background thread in every caseSebastian Siewior1-2/+1
If the ubifs partition is mounted RO and then remounted RW we end up with no thread name in ubifs_remount_rw() and the thread appears nameless. Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-17[XFS] Don't do I/O beyond eof when unreserving spaceLachlan McIlroy1-0/+18
When unreserving space with boundaries that are not block aligned we round up the start and round down the end boundaries and then use this function, xfs_zero_remaining_bytes(), to zero the parts of the blocks that got dropped during the rounding. The problem is we don't consider if these blocks are beyond eof. Worse still is if we encounter delayed allocations beyond eof we will try to use the magic delayed allocation block number as a real block number. If the file size is ever extended to expose these blocks then we'll go through xfs_zero_eof() to zero them anyway. SGI-PV: 983683 SGI-Modid: xfs-linux-melb:xfs-kern:32055a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17[XFS] Fix use-after-free with buffersLachlan McIlroy1-24/+20
We have a use-after-free issue where log completions access buffers via the buffer log item and the buffer has already been freed. Fix this by taking a reference on the buffer when attaching the buffer log item and release the hold when the buffer log item is detached and we no longer need the buffer. Also create a new function xfs_buf_item_free() to combine some common code. SGI-PV: 985757 SGI-Modid: xfs-linux-melb:xfs-kern:32025a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17[XFS] Prevent lockdep false positives when locking two inodes.David Chinner2-1/+16
If we call xfs_lock_two_inodes() to grab both the iolock and the ilock, then drop the ilocks on both inodes, then grab them again (as xfs_swap_extents() does) then lockdep will report a locking order problem. This is a false positive. To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks at once - force calers to make two separate calls. This means that nested dropping and regaining of the ilocks will retain the same lockdep subclass and so lockdep will not see anything wrong with this code. SGI-PV: 986238 SGI-Modid: xfs-linux-melb:xfs-kern:31999a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17[XFS] Fix barrier status change detection.David Chinner1-1/+1
The current code in xlog_iodone() uses the wrong macro to check if the barrier has been cleared due to an EOPNOTSUPP error form the lower layer. SGI-PV: 986143 SGI-Modid: xfs-linux-melb:xfs-kern:31984a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17[XFS] Prevent direct I/O from mapping extents beyond eofLachlan McIlroy1-0/+4
With the help from some tracing I found that we try to map extents beyond eof when doing a direct I/O read. It appears that the way to inform the generic direct I/O path (ie do_direct_IO()) that we have breached eof is to return an unmapped buffer from xfs_get_blocks_direct(). This will cause do_direct_IO() to jump to the hole handling code where is will check for eof and then abort. This problem was found because a direct I/O read was trying to map beyond eof and was encountering delayed allocations. The delayed allocations beyond eof are speculative allocations and they didn't get converted when the direct I/O flushed the file because there was only enough space in the current AG to convert and write out the dirty pages within eof. Note that xfs_iomap_write_allocate() wont necessarily convert all the delayed allocation passed to it - it will return after allocating the first extent - so if the delayed allocation extends beyond eof then it will stay that way. SGI-PV: 983683 SGI-Modid: xfs-linux-melb:xfs-kern:31929a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17[XFS] Fix regression introduced by remount fixupChristoph Hellwig1-0/+20
Logically we would return an error in xfs_fs_remount code to prevent users from believing they might have changed mount options using remount which can't be changed. But unfortunately mount(8) adds all options from mtab and fstab to the mount arguments in some cases so we can't blindly reject options, but have to check for each specified option if it actually differs from the currently set option and only reject it if that's the case. Until that is implemented we return success for every remount request, and silently ignore all options that we can't actually change. SGI-PV: 985710 SGI-Modid: xfs-linux-melb:xfs-kern:31908a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17[XFS] Move memory allocations for log tracing out of the critical pathLachlan McIlroy2-21/+40
Memory allocations for log->l_grant_trace and iclog->ic_trace are done on demand when the first event is logged. In xlog_state_get_iclog_space() we call xlog_trace_iclog() under a spinlock and allocating memory here can cause us to sleep with a spinlock held and deadlock the system. For the log grant tracing we use KM_NOSLEEP but that means we can lose trace entries. Since there is no locking to serialize the log grant tracing we could race and have multiple allocations and leak memory. So move the allocations to where we initialize the log/iclog structures. Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace since it's not even used. SGI-PV: 983738 SGI-Modid: xfs-linux-melb:xfs-kern:31896a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17[CIFS] use common code for turning off ATTR_READONLY in cifs_unlinkSteve French1-167/+139
We already have a cifs_set_file_info function that can flip DOS attribute bits. Have cifs_unlink call it to handle turning ATTR_HIDDEN on and ATTR_READONLY off when an unlink attempt returns -EACCES. This also removes a level of indentation from cifs_unlink. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-17cifs: clean up variables in cifs_unlinkJeff Layton2-50/+42
Change parameters to cifs_unlink to match the ones used in the generic VFS. Add some local variables to cut down on the amount of struct dereferencing that needs to be done, and eliminate some unneeded NULL pointer checks on the parent directory inode. Finally, rename pTcon to "tcon" to more closely match standard kernel coding style. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-15GFS2: GFS2 will panic if you misspell any mount optionsAbhijith Das1-6/+9
The gfs2 superblock pointer is NULL after a failed mount. When control eventually goes to gfs2_kill_sb, we dereference this NULL pointer. This patch ensures that the gfs2 superblock pointer is not NULL before being dereferenced in gfs2_kill_sb. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-15GFS2: Direct IO write at end of file errorBob Peterson1-1/+1
This patch fixes a problem whereby a direct_io write doesn't fall back to buffered write properly at end of file. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-14rescan_partitions(): make device capacity errors non-fatalAndrew Morton1-2/+2
Herton Krzesinski reports that the error-checking changes in 04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and fs/partition/check.c: check value returned by add_partition") cause his buggy USB camera to no longer mount. "The camera is an Olympus X-840. The original issue comes from the camera itself: its format program creates a partition with an off by one error". Buggy devices happen. It is better for the kernel to warn and to proceed with the mount. Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Cc: Abdel Benamrouche <draconux@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-14mm: ifdef Quicklists in /proc/meminfoHugh Dickins1-4/+8
A "Quicklists: 0 kB" line has just started appearing in /proc/meminfo, but most architectures (including x86) don't have them configured, so #ifdef it, like the highmem lines. And those architectures which do have quicklists configured are using them for page tables: so let's place it next to PageTables. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-14bfs: fix Lockdep warningEric Sesterhenn1-1/+1
This fixes: ============================================= [ INFO: possible recursive locking detected ] 2.6.27-rc5-00283-g70bb089 #68 --------------------------------------------- touch/6855 is trying to acquire lock: (&info->bfs_lock){--..}, at: [<c02262f5>] bfs_delete_inode+0x9e/0x18c but task is already holding lock: (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187 other info that might help us debug this: 2 locks held by touch/6855: #0: (&type->i_mutex_dir_key#5){--..}, at: [<c018ad13>] do_filp_open+0x10b/0x62f #1: (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187 stack backtrace: Pid: 6855, comm: touch Not tainted 2.6.27-rc5-00283-g70bb089 #68 [<c013e769>] validate_chain+0x458/0x9f4 [<c013bece>] ? trace_hardirqs_off+0xb/0xd [<c013f36b>] __lock_acquire+0x666/0x6e0 [<c013f440>] lock_acquire+0x5b/0x77 [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c06aab74>] mutex_lock_nested+0xbc/0x234 [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c02262f5>] bfs_delete_inode+0x9e/0x18c [<c0226257>] ? bfs_delete_inode+0x0/0x18c [<c01925e1>] generic_delete_inode+0x94/0xfe [<c019265d>] generic_drop_inode+0x12/0x12f [<c0191b7e>] iput+0x4b/0x4e [<c0226d1e>] bfs_create+0x163/0x187 [<c0188b42>] vfs_create+0xa6/0x114 [<c018adb5>] do_filp_open+0x1ad/0x62f [<c0107cdc>] ? native_sched_clock+0x82/0x96 [<c06ac309>] ? _spin_unlock+0x27/0x3c [<c019379e>] ? alloc_fd+0xbf/0xc9 [<c06ae2f4>] ? sub_preempt_count+0x9d/0xab [<c019379e>] ? alloc_fd+0xbf/0xc9 [<c0180391>] do_sys_open+0x42/0xb8 [<c041d564>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c0180449>] sys_open+0x1e/0x26 [<c01038bd>] sysenter_do_call+0x12/0x31 ======================= The problem is that we don't unlock the bfs->lock mutex before calling iput (we do in the other cases). Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-14proc: more debugging for "already registered" caseAlexey Dobriyan1-2/+2
Print parent directory name as well. The aim is to catch non-creation of parent directory when proc_mkdir will return NULL and all subsequent registrations go directly in /proc instead of intended directory. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Fixed insane printk string while at it. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-11Merge branch 'for_linus' of ↵Linus Torvalds2-25/+20
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: add llseek method udf: Fix error paths in udf_new_inode() udf: Fix lock inversion between iprune_mutex and alloc_mutex (v2)
2008-09-10ocfs2: Fix a bug in direct IO read.Tao Ma1-1/+1
ocfs2 will become read-only if we try to read the bytes which pass the end of i_size. This can be easily reproduced by following steps: 1. mkfs a ocfs2 volume with bs=4k cs=4k and nosparse. 2. create a small file(say less than 100 bytes) and we will create the file which is allocated 1 cluster. 3. read 8196 bytes from the kernel using O_DIRECT which exceeds the limit. 4. The ocfs2 volume becomes read-only and dmesg shows: OCFS2: ERROR (device sda13): ocfs2_direct_IO_get_blocks: Inode 66010 has a hole at block 1 File system is now read-only due to the potential of on-disk corruption. Please run fsck.ocfs2 once the file system is unmounted. So suppress the ERROR message. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-09-09Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6Linus Torvalds10-141/+221
* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6: UBIFS: make minimum fanout 3 UBIFS: fix division by zero UBIFS: amend f_fsid UBIFS: fill f_fsid UBIFS: improve statfs reporting even more UBIFS: introduce LEB overhead UBIFS: add forgotten gc_idx_lebs component UBIFS: fix assertion UBIFS: improve statfs reporting UBIFS: remove incorrect index space check UBIFS: push empty flash hack down UBIFS: do not update min_idx_lebs in stafs UBIFS: allow for racing between GC and TNC UBIFS: always read hashed-key nodes under TNC mutex UBIFS: fix zero-length truncations
2008-09-09NFS: Restore missing hunk in NFS mount option parserChuck Lever1-0/+6
Automounter maps can contain mount options valid for other NFS implementations but not for Linux. The Linux automounter uses the mount command's "-s" command line option ("s" for "sloppy") so that mount requests containing such options are not rejected. Commit f45663ce5fb30f76a3414ab3ac69f4dd320e760a attempted to address a known regression with text-based NFS mount option parsing. Unrecognized mount options would cause mount requests to fail, even if the "-s" option was used on the mount command line. Unfortunately, this commit was not complete as submitted. It adds a new mount option, "sloppy". But it is missing a hunk, so it now allows NFS mounts with unrecognized mount options, even if the "sloppy" option is not present. This could be a problem if a required critical mount option such as "sync" is misspelled, for example, and is considered a regression from 2.6.26. This patch restores the missing hunk. Now, the default behavior of text-based NFS mount options is as before: any unrecognized mount option will cause the mount to fail. Please include this in 2.6.27-rc. Thanks to Neil Brown for reporting this. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-08udf: add llseek methodChristoph Hellwig1-0/+1
UDF currently doesn't set a llseek method for regular files, which means it will fall back to default_llseek. This means no one can seek beyond 2 Gigabytes on udf, and that there's not protection vs the i_size updates from writers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2008-09-05UBIFS: make minimum fanout 3Artem Bityutskiy1-1/+1
UBIFS does not really work correctly when fanout is 2, because of the way we manage the indexing tree. It may just become a list and UBIFS screws up. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-05UBIFS: fix division by zeroArtem Bityutskiy1-6/+7
If fanout is 3, we have division by zero in 'ubifs_read_superblock()': divide error: 0000 [#1] PREEMPT SMP Pid: 28744, comm: mount Not tainted (2.6.27-rc4-ubifs-2.6 #23) EIP: 0060:[<f8f9e3ef>] EFLAGS: 00010202 CPU: 0 EIP is at ubifs_reported_space+0x2d/0x69 [ubifs] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000 ESI: 00000000 EDI: f0ae64b0 EBP: f1f9fcf4 ESP: f1f9fce0 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>