If block_read_full_page() detects an error when running get_block() it will
run SetPageError(), then it will zero out the block in pagecache and will mark
the buffer_head uptodate.
So at the end of readahead we end up with a non-uptodate pagecache page which
is marked PageError. But it has uptodate buffers.
The pagefault code will run ClearPageError, will launch readpage a second time
and block_read_full_page() will notice the uptodate buffers and will mark the
page uptodate as well. We end up with an uptodate, !PageError page full of
zeros and the error is lost.
(It seems a little odd that filemap_nopage() runs ClearPageError(). I guess
all of this adds up to meaning that for each attempted access to the page, the
pagefault handler will retry the I/O. Which is good and bad. If the app is
ignoring SIGBUS for some reason we could get a lot of back-to-back I/O
errors.)
Fix it by not marking the pagecache buffer_head as uptodate if the attempt to
map that buffer to a disk block failed.
Credit-to: Qu Fuping <fs@ercist.iscas.ac.cn>
For reporting the bug and identifying its source.
Signed-off-by: Qu Fuping <fs@ercist.iscas.ac.cn>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
VmallocTotal: 34359738367 kB
VmallocUsed: 266288 kB
VmallocChunk: 18014366299193295 kB
is unsettling - x86_64 and some other architectures keep a separate address
range for modules in vmalloc's vmlist, which /proc/meminfo should pass over.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This change from March 3rd causes the partition parsing code to ignore
partitions which have a signature byte of zero. Turns out that more people
have such partitions than we expected, and their device numbering is coming up
wrong in post-2.6.11 kernels.
So revert the change while we think about the problem a bit more.
Cc: Andries Brouwer <Andries.Brouwer@cwi.nl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes an off by one error found by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add some comments about task->comm, to explain what it is near its definition
and provide some important pointers to its uses.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use NULL instead of 0 for pointer (sparse warning):
fs/reiserfs/namei.c:611:50: warning: Using plain integer as NULL pointer
Signed-off-by: Randy Dunlap <rddunlap@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes some needlessly global identifiers static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Arjan van de Ven <arjanv@infradead.org>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This had a fatal lock ranking bug: we do journal_start outside
mpage_writepages()'s lock_page().
Revert the whole thing, think again.
Credit-to: Jan Kara <jack@suse.cz>
For identifying the bug.
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The only caller that ever sets it can call fsync_bdev itself easily. Also
update some comments.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There's no longer a reason to document the obsolete BK usage.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The `last_bh' logic probably isn't worth much. In those situations where only
the front part of the page is being written out we will save some looping but
in the vastly more common case of an all-page writeout if just adds more code.
Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove all those get_bh()'s and put_bh()'s by extending lock_page() to cover
the troublesome regions.
(get_bh() and put_bh() happen every time whereas contention on a page's lock
in there happens basically never).
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When running
fsstress -v -d $DIR/tmp -n 1000 -p 1000 -l 2
on an ext2 filesystem with 1024 byte block size, on SMP i386 with 4096 byte
page size over loopback to an image file on a tmpfs filesystem, I would
very quickly hit
BUG_ON(!buffer_async_write(bh));
in fs/buffer.c:end_buffer_async_write
It seems that more than one request would be submitted for a given bh
at a time.
What would happen is the following:
2 threads doing __mpage_writepages on the same page.
Thread 1 - lock the page first, and enter __block_write_full_page.
Thread 1 - (eg.) mark_buffer_async_write on the first 2 buffers.
Thread 1 - set page writeback, unlock page.
Thread 2 - lock page, wait on page writeback
Thread 1 - submit_bh on the first 2 buffers.
=> both requests complete, none of the page buffers are async_write,
end_page_writeback is called.
Thread 2 - wakes up. enters __block_write_full_page.
Thread 2 - mark_buffer_async_write on (eg.) the last buffer
Thread 1 - finds the last buffer has async_write set, submit_bh on that.
Thread 2 - submit_bh on the last buffer.
=> oops.
So change __block_write_full_page to explicitly keep track of the last bh
we need to issue, so we don't touch anything after issuing the last
request.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a race where __block_prepare_write can leak out an in-flight read
against a bh if get_block returns an error. This can lead to the page
becoming unlocked while the buffer is locked and the read still in flight.
__mpage_writepage BUGs on this condition.
BUG sighted on a 2-way Itanium2 system with 16K PAGE_SIZE running
fsstress -v -d $DIR/tmp -n 1000 -p 1000 -l 2
where $DIR is a new ext2 filesystem with 4K blocks that is quite
small (causing get_block to fail often with -ENOSPC).
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This cleans up the error handling and fixes a crash if a hostfs mount fails.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This makes sure that reclaimable buffer headers and reclaimable inodes
are accounted properly during the overcommit checks.
Signed-off-by: Andrea Arcangeli <andrea@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
handling for unwritten extents can be moved out of interrupt context.
SGI Modid: xfs-linux:xfs-kern:22343a
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Christoph Hellwig <hch@sgi.com>
Modify xtSearch so that it returns the next allocated block when the
requested block is unmapped. This can be used to make sure we don't
create a new extent that overlaps the next one.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds jfs_syncpt, which calls lmLogSync to write sync points
to the journal both in jfs_sync_fs and when sync barrier processing
completes.
lmLogSync accomplishes two things: 1) it pushes logged-but-dirty
metadata pages to disk, and 2) it writes a sync record to the journal
so that jfs_fsck doesn't need to replay more transactions than is
necessary.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
jfs has never worked on architecutures where the page size was not 4K.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
JFS code has always assumed a page size of 4K. This patch fixes the
non-pagecache uses of pages to deal with larger pages.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
JFS was creating a new IAG (inode aggregate group) in one address
space, and afterwards, accessing it from another. This could lead to
complications when cache pages contain more than one page of jfs
metadata. This patch causes the IAG to be initialized in the same
address space that it is subsequently accessed with.
This also elimitates an I/O, but IAG's aren't created too often.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use an inline pxd list rather than an xad list in the xadlock.
When the number of extents being modified can fit with the xadlock,
a transaction can be committed asynchronously. Using a list of
pxd's instead of xad's allows us to fit 4 extents, rather than 2.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some KernelDoc descriptions are updated to match the current code.
No code changes.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I have recompiled Linux kernel 2.6.11.5 documentation for me and our
university students again. The documentation could be extended for more
sources which are equipped by structured comments for recent 2.6 kernels. I
have tried to proceed with that task. I have done that more times from 2.6.0
time and it gets boring to do same changes again and again. Linux kernel
compiles after changes for i386 and ARM targets. I have added references to
some more files into kernel-api book, I have added some section names as well.
So please, check that changes do not break something and that categories are
not too much skewed.
I have changed kernel-doc to accept "fastcall" and "asmlinkage" words reserved
by kernel convention. Most of the other changes are modifications in the
comments to make kernel-doc happy, accept some parameters description and do
not bail out on errors. Changed <pid> to @pid in the description, moved some
#ifdef before comments to correct function to comments bindings, etc.
You can see result of the modified documentation build at
http://cmp.felk.cvut.cz/~pisa/linux/lkdb-2.6.11.tar.gz
Some more sources are ready to be included into kernel-doc generated
documentation. Sources has been added into kernel-api for now. Some more
section names added and probably some more chaos introduced as result of quick
cleanup work.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The extra race-with-truncate-then-retry logic around
ext3_get_block_handle(), which was inherited from ext2, becomes unecessary
for ext3, since we have already obtained the ei->truncate_sem in
ext3_get_block_handle() before calling ext3_alloc_branch(). The
ei->truncate_sem is already there to block concurrent truncate and block
allocation on the same inode. So the inode's indirect addressing tree
won't be changed after we grab that semaphore.
We could, after get the semaphore, re-verify the branch is up-to-date or
not. If it has been changed, then get the updated branch. If we still
need block allocation, we will have a safe version of the branch to work
with in the ext3_find_goal()/ext3_splice_branch().
The code becomes more readable after remove those retry logic. The patch
also clean up some gotos in ext3_get_block_handle() to make it more
readable.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
comp_short_keys() massaged into sane form, which kills the last place where
pointer to in_core_key (or any object containing such) would be cast to or
from something else. At that point we are free to change layout of
in_core_key - nothing depends on it anymore.
So we drop the mess with union in there and simply use (unconditional) __u64
k_offset and __u8 k_type instead; places using in_core_key switched to those.
That gives _far_ better code than current mess - on all platforms.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fixes for a couple of bugs exposed by the above: le32_to_cpu() used on 16bit
value and missing conversion in comparison of host- and little-endian values.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
little-endian objects annotated as such; again, obviously no changes of
resulting code, we only replace __u16 with __le16, etc. in relevant places.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
struct reiserfs_key cloned; (currently) identical struct in_core_key added.
Places that expect host-endian data in reiserfs_key switched to in_core_key.
Basically, we get annotation of reiserfs_key users and keep the resulting tree
obviously equivalent to original.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For tree mount maps, a call to chdir or chroot, to a directory above the
moint point directories at a certain time during the expire results in the
expire incorrectly thinking the tree is not busy. This patch adds a check
to see if the filesystem above the tree mount points is busy and also locks
the filesystem during the tree mount expire to prevent the race.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's possible for an event wait request to arive before the event
requestor. If this happens the daemon never gets notified and autofs
hangs.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes the leak of sb->s_fs_info in both the HFS and HFS+
modules. In addition to this, it fixes an oops happening when trying to
mount a non-hfsplus filesystem using hfsplus. This patch is from Roman
Zippel, based off patches sent by myself.
Signed-off-by: Colin Leroy <colin@colino.net>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>