* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
init: Open /dev/console from rootfs
mqueue: fix typo "failues" -> "failures"
mqueue: only set error codes if they are really necessary
mqueue: simplify do_open() error handling
mqueue: apply mathematics distributivity on mq_bytes calculation
mqueue: remove unneeded info->messages initialization
mqueue: fix mq_open() file descriptor leak on user-space processes
fix race in d_splice_alias()
set S_DEAD on unlink() and non-directory rename() victims
vfs: add NOFOLLOW flag to umount(2)
get rid of ->mnt_parent in tomoyo/realpath
hppfs can use existing proc_mnt, no need for do_kern_mount() in there
Mirror MS_KERNMOUNT in ->mnt_flags
get rid of useless vfsmount_lock use in put_mnt_ns()
Take vfsmount_lock to fs/internal.h
get rid of insanity with namespace roots in tomoyo
take check for new events in namespace (guts of mounts_poll()) to namespace.c
Don't mess with generic_permission() under ->d_lock in hpfs
sanitize const/signedness for udf
nilfs: sanitize const/signedness in dealing with ->d_name.name
...
Fix up fairly trivial (famous last words...) conflicts in
drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
This adds reader's lock for the_nilfs->cno in nilfs_ioctl_sync,
for the_nilfs->cno should be proctected by segctor_sem when reading.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is a trivial patch to remove unnecessary condition.
load_segment_summary() checks crc of segment_summary OR crc of whole
log data blocks based on boolean argument full_check. However,
callers of the function pass only 1 as full_check, which means only
whole log data blocks checking code is running all the time.
This patch deletes the condition and full_check argument and also
deletes enum 'NILFS_SEG_FAIL_CHECKSUM_SEGSUM' and corresponding case
clause, for it is nolonger used anymore.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This moves iterator to submit write requests for a series of logs into
segbuf.c, and hides nilfs_segbuf_write() and nilfs_segbuf_wait() in
the file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This replaces s_dirt flag use in nilfs with a new flag added on the
nilfs object. The s_dirt flag was used to indicate if
sop->write_super() should be called, however the current version of
nilfs does not use the callback. Thus, it can be replaced with the
own flag.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Jiro SEKIBA <jir@unicus.jp>
This will clean up nilfs_segctor_req struct and the obscure request
argument passed among private methods of segment constructor.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is a trivial patch to delete unnecessary condition in nilfs_dat_translate.
nilfs_dat_translate() will asign translated address to *blocknrp if blocknrp
is not NULL. However the condition is unneeded, because all callers of
nilfs_dat_translate() pass blocknrp properly.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_error() calls nilfs_detach_segment_constructor() if
errors=remount-ro option is specified, and this may lead to a hang due
to recursive locking of, for instance, nilfs->ns_segctor_sem and
others.
In this case, detaching segment constructor is not necessary because
read-only flag is set to the filesystem and further writes are
blocked.
This fixes the potential hang issue by removing the
nilfs_detach_segment_constructor() call from nilfs_error.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
A few nilfs2 ioctls need to ask for and then later release write
access to the mount in order to avoid potential write to read-only
mounts.
This adds the missing mnt_want_write and mnt_drop_write in
nilfs_ioctl_change_cpmode, nilfs_ioctl_delete_checkpoint, and
nilfs_ioctl_clean_segments.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds a function to send discard requests for given array of
segment numbers, and calls the function when garbage collection
succeeded.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This fixes incorrect usage of nilfs_segctor_confirm() test function in
nilfs_segctor_destroy(); nilfs_segctor_confirm() returns zero if the
filesystem is not clean, so its use in nilfs_segctor_destroy() needs
inversion.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
The C99 specification states in section 6.11.5:
The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is a trivial style fix patch to mend errors/warnings
reported by "checkpatch.pl --file".
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This reverts commit e4c570c4cb, as
requested by Alexey:
"I think I gave a good enough arguments to not merge it.
To iterate:
* patch makes impossible to start using ext3 on EXT3_FS=n kernels
without reboot.
* this is done only for one pointer on task_struct"
None of config options which define task_struct are tristate directly
or effectively."
Requested-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
journal_info in task_struct is used in journaling file system only. So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This separates wait function for submitted logs from the write
function nilfs_segctor_write(). A new list of segment buffers
"sc_write_logs" is added to hold logs under writing, and double
buffering is partially applied to hide io latency.
At this point, the double buffering is disabled for blocksize <
pagesize because page dirty flag is turned off during write and dirty
buffers are not properly collected for pages crossing over segments.
To receive full benefit of the double buffering, further refinement is
needed to move the io wait outside the lock section of log writer.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds a few iterator functions for segment buffers to make it easy
to handle multiple series of logs.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Hides nilfs_write_info struct and nilfs_segbuf_prepare_write function
in segbuf.c to simplify the interface of nilfs_segbuf_write function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This moves io status variables in nilfs_write_info struct to
nilfs_segment_buffer struct.
This is a preparation to hide nilfs_write_info in segment buffer code.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Previously, log writer had possibility to set an io error flag on
segments even in case of memory allocation failure.
This fixes the issue.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This applies list_splice_tail (or list_splice_tail_init) operation
instead of list_splice (or list_splice_init, respectively) to append a
new list to tail of an existing list.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Replace mark_inode_dirty() as nilfs_mark_inode_dirty()
to reduce deep function calls.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Delete mark_inode_dirty() in nilfs_delete_entry() to reduce duplicate
mark_inode_dirty() calls both in nilfs_rename() and nilfs_delete_entry().
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Delete mark_inode_dirty() in nilfs_commit_chunk(), for callers of
nilfs_commit_chunk() will call equivalent mark_inode_dirty()
after calling nilfs_commit_chunk().
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
change return type of nilfs_commit_chunk() as void from int,
for nilfs_set_file_dirty() usually does not return error.
This is an intermediate patch to reduce mark_inode_dirty() in
nilfs_commit_chunk().
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Split nilfs_unlink() to reduce nested transaction and duplicate
mark_inode_dirty() calls when calling nilfs_unlink() from nilfs_rmdir().
nilfs_do_unlink() is an actual unlink functionality which is not
in transaction and does not call mark_inode_dirty() for dentry argument.
nilfs_unlink() is a wrapper function for do_nilfs_unlink() with
transaction and mark_inode_dirty() for dentry argument.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is an intermidiate patch to reduce redandunt mark_inode_dirty() calls
by calling inode_inc_link_count() and inode_dec_link_count() functions.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
It is redundant to call mark_inode_dirty() in nilfs_new_inode() because
all caller of nilfs_new_inode() will call mark_inode_dirty()
after calling nilfs_new_inode() directly or indirectly in transaction.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds "norecovery" mount option which disables temporal write
access to read-only mounts or snapshots during mount/recovery.
Without this option, write access will be even performed for those
types of mounts; the temporal write access is needed to mount root
file system read-only after an unclean shutdown.
This option will be helpful when user wants to prevent any write
access to the device.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Eric Sandeen <sandeen@redhat.com>
This adds a helper function, nilfs_valid_fs() which returns if nilfs
is in a valid state or not.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Although mount recovery of nilfs is integrated in load_nilfs()
procedure, the completion of recovery was isolated from the procedure
and performed at the end of the fill_super routine.
This was somewhat confusing since the recovery is needed for the nilfs
object, not for a super block instance.
To resolve the inconsistency, this will integrate the recovery
completion into load_nilfs().
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This inserts readahead in the recovery code. The readahead request is
issued per segment while searching the latest super root block.
This will shorten mount time after unclean unmount. A measurement
shows the recovery time was reduced by more than 60 percent:
e.g. real 0m11.586s -> 0m3.918s (x 2.96)
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This eliminates obsolete nilfs_get_sufile_get_segment_usage() and
nilfs_set_sufile_segment_usage() from sufile.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds nilfs_sufile_set_segment_usage() function in sufile to
replace direct access to the sufile metadata in log writer code.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds nilfs_sufile_mark_dirty() function in sufile to replace
nilfs_touch_segusage() function in log writer code. This is a
preparation for the further cleanup which will move out low level
sufile operations in the log writer.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This implements cache operation in get block routines of palloc code:
nilfs_palloc_get_desc_block(), nilfs_palloc_get_bitmap_block(), and
nilfs_palloc_get_entry_block().
This will complete the palloc cache.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds the palloc cache to ifile. The palloc cache is allocated on
the extended region of nilfs_mdt_info struct. The struct
nilfs_ifile_info defines the extended on memory structure of ifile.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Data pages in gcdat metadata file (i.e. the secondary DAT for GC), are
cleared or even moved back to the normal DAT when a shot of garbage
collection was done.
Buffer heads held by the palloc cache of gcdat must be cleared before
these page cache manipulation. This adds nilfs_palloc_clear_cache()
to ensure this.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds the palloc cache to DAT file. The palloc cache is allocated
on the extended region of nilfs_mdt_info struct. The struct
nilfs_dat_info defines the extended on memory structure of DAT.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds setup and cleanup routines of the persistent object
allocator cache.
According to ftrace analyses, accessing buffers of the DAT file
suffers indispensable overhead many times. To mitigate the overhead,
This introduce cache framework for the persistent object allocator
(palloc) which the DAT file and ifile are using.
struct nilfs_palloc_cache represents the cache object per metadata
file using palloc.
The cache is initialized through nilfs_palloc_setup_cache() and
destroyed by nilfs_palloc_destroy_cache(); callers of the former
function will be added to individual allocators of DAT and ifile on
successive patches.
nilfs_palloc_destroy_cache() will be called from nilfs_mdt_destroy()
if the cache is attached to a metadata file. A companion function
nilfs_palloc_clear_cache() is provided to allow releasing buffer head
references independently with the cleanup task. This adjunctive
function will be used before invalidating pages of metadata file with
the cache.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This expands a trivial address calculation in the function into its
every callsite. This expansion improves readability of the callers.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This removes the obsolete nilfs_btnode_get() function and makes
nilfs_btree_get_block() directly call nilfs_btnode_submit_block().
This expansion will provide better opportunity for code optimization.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This removes the obsolete argument from nilfs_btnode_submit_block().
This will complete separating a create function of btree node.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This displaces nilfs_btnode_get() use to create new btree node block
with nilfs_btnode_create_block.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Adds a separate routine for creating a btree node block. This is a
preparation to reduce the depth of function calls during submitting
btree node buffer.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>