Commit graph

50052 commits

Author SHA1 Message Date
Chao Yu
c56f16dab0 f2fs: add tracepoint for f2fs_gc
This patch adds tracepoint for f2fs_gc.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-21 15:55:05 -07:00
Chao Yu
7f2b4e8ea5 f2fs: retry to revoke atomic commit in -ENOMEM case
During atomic committing, if we encounter -ENOMEM in revoke path, it's
better to give a chance to retry revoking.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-21 15:55:04 -07:00
Jaegeuk Kim
afd2b4da40 f2fs: let fill_super handle roll-forward errors
If we set CP_ERROR_FLAG in roll-forward error, f2fs is no longer to proceed
any IOs due to f2fs_cp_error(). But, for example, if some stale data is involved
on roll-forward process, we're able to get -ENOENT, getting fs stuck.
If we get any error, let fill_super set SBI_NEED_FSCK and try to recover back
to stable point.

Cc: <stable@vger.kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-21 15:55:03 -07:00
Qiuyang Sun
f2220c7f15 f2fs: merge equivalent flags F2FS_GET_BLOCK_[READ|DIO]
Currently, the two flags F2FS_GET_BLOCK_[READ|DIO] are totally equivalent
and can be used interchangably in all scenarios they are involved in.
Neither of the flags is referenced in f2fs_map_blocks(), making them both
the default case. To remove the ambiguity, this patch merges both flags
into F2FS_GET_BLOCK_DEFAULT, and introduces an enum for all distinct flags.

Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-21 15:55:02 -07:00
Chao Yu
4b2414d04e f2fs: support journalled quota
This patch supports to enable f2fs to accept quota information through
mount option:
- {usr,grp,prj}jquota=<quota file path>
- jqfmt=<quota type>

Then, in ->mount flow, we can recover quota file during log replaying,
by this, journelled quota can be supported.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: Fix wrong return values.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-21 15:54:48 -07:00
Chao Yu
b8c502b81e f2fs: fix potential overflow when adjusting GC cycle
While comparing signed and unsigned variables, compiler will converts the
signed value to unsigned one, due to this reason, {in,de}crease_sleep_time
may return overflowed result.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-15 10:40:14 -07:00
Chao Yu
9a20d391cd f2fs: avoid unneeded sync on quota file
We only need to sync quota file with appointed quota type instead of all
types in f2fs_quota_{on,off}.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-15 10:40:13 -07:00
Jaegeuk Kim
d9872a698c f2fs: introduce gc_urgent mode for background GC
This patch adds a sysfs entry to control urgent mode for background GC.
If this is set, background GC thread conducts GC with gc_urgent_sleep_time
all the time.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-15 10:40:12 -07:00
Jaegeuk Kim
3537581a72 f2fs: use IPU for cold files
We expect cold files write data sequentially, but sometimes some of small data
can be updated, which incurs fragmentation.
Let's avoid that.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-15 10:40:11 -07:00
Yunlong Song
008396e1b0 f2fs: fix the size value in __check_sit_bitmap
The current size value is not correct and will miss bitmap check.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-15 10:40:10 -07:00
Chao Yu
b0af6d491a f2fs: add app/fs io stat
This patch enables inner app/fs io stats and introduces below virtual fs
nodes for exposing stats info:
/sys/fs/f2fs/<dev>/iostat_enable
/proc/fs/f2fs/<dev>/iostat_info

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix wrong stat assignment]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-09 21:43:58 -07:00
Yunlong Song
35ee82ca13 f2fs: do not change the valid_block value if cur_valid_map was wrongly set or cleared
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-09 17:45:23 -07:00
Yunlong Song
6415fedc57 f2fs: update cur_valid_map_mir together with cur_valid_map
When cur_valid_map passes the f2fs_test_and_set(,clear)_bit test,
cur_valid_map_mir update is skipped unlikely, so fix it. The fix
now changes the mirror check together with cur_valid_map all the
time.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: Fix unused variable and add unlikely for corner condition.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-09 17:45:21 -07:00
Jaegeuk Kim
a36c106dff f2fs: use printk_ratelimited for f2fs_msg
This patch reduces contention of printks.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 19:09:52 -07:00
Jaegeuk Kim
bf9e697ecd f2fs: expose features to sysfs entry
This patch exposes what features are supported by current f2fs build to sysfs
entry via:

/sys/fs/f2fs/features/
/sys/fs/f2fs/dev/features

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 19:09:51 -07:00
Chao Yu
704956ecf5 f2fs: support inode checksum
This patch adds to support inode checksum in f2fs.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix verification flow]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 19:09:26 -07:00
Jaegeuk Kim
4f31d26b0c f2fs: return wrong error number on f2fs_quota_write
This must return size, not error number.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 19:05:05 -07:00
Yunlong Song
401db79f61 f2fs: provide f2fs_balance_fs to __write_node_page
Let node writeback also do f2fs_balance_fs to ensure there are always enough free
segments.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 19:05:04 -07:00
Chao Yu
ddc34e328d f2fs: introduce f2fs_statfs_project
This patch introduces f2fs_statfs_project, it enables to show usage
status of directory tree which is limited with project quota.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:36 -07:00
Chao Yu
2c1d030569 f2fs: support F2FS_IOC_FS{GET,SET}XATTR
This patch adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR ioctl interface
support for f2fs. The interface is kept consistent with the one
of ext4/xfs.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:35 -07:00
Jaegeuk Kim
b6a245eb34 f2fs: don't need to wait for node writes for atomic write
We have a node chain to serialize node block writes, so if any IOs for
node block writes are reordered, we'll get broken node chain. IOWs,
roll-forward recovery will see all or none node blocks given fsync
mark.

E.g.,
Node chain consists of:
 N1 -> N2 -> N3 -> NFSYNC -> N1' -> N2' -> N'FSYNC

Reordered to:
1) N1 -> N2 -> N3 -> N2' -> NFSYNC -> N'FSYNC -> power-cut
2) N1 -> N2 -> N3 -> N1' -> NFSYNC -> power-cut
3) N1 -> N2 -> NFSYNC -> N1' -> N'FSYNC -> N3 -> power-cut
4) N1 -> NFSYNC -> N1' -> N2' -> N'FSYNC -> N3 -> power-cut

Roll-forward recovery can proceed to:
1) N1 -> N2 -> N3 -> NFSYNC -> X
2) N1 -> N2 -> N3 -> NFSYNC -> N1' -> X
3) N1 -> N2 -> N3 -> FSYNC -> N1' -> X
4) N1 -> X

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:34 -07:00
Jaegeuk Kim
dc6b205510 f2fs: avoid naming confusion of sysfs init
This patch changes the function names of sysfs init to follow ext4.

f2fs_init_sysfs <-> f2fs_register_sysfs
f2fs_exit_sysfs <-> f2fs_unregister_sysfs

Suggested-by: Chao Yu <yuchao0@huawei.com>
Reivewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:33 -07:00
Chao Yu
5c57132eaf f2fs: support project quota
This patch adds to support plain project quota.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:32 -07:00
Chao Yu
a6d3a479ae f2fs: record quota during dot{,dot} recovery
In ->lookup(), we will have a try to recover dot or dotdot for
corrupted directory, once disk quota is on, if it allocates new
block during dotdot recovery, we need to record disk quota info
for the allocation, so this patch fixes this issue by adding
missing dquot_initialize() in __recover_dot_dentries.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:31 -07:00
Chao Yu
7a2af766af f2fs: enhance on-disk inode structure scalability
This patch add new flag F2FS_EXTRA_ATTR storing in inode.i_inline
to indicate that on-disk structure of current inode is extended.

In order to extend, we changed the inode structure a bit:

Original one:

struct f2fs_inode {
	...
	struct f2fs_extent i_ext;
	__le32 i_addr[DEF_ADDRS_PER_INODE];
	__le32 i_nid[DEF_NIDS_PER_INODE];
}

Extended one:

struct f2fs_inode {
        ...
        struct f2fs_extent i_ext;
	union {
		struct {
			__le16 i_extra_isize;
			__le16 i_padding;
			__le32 i_extra_end[0];
		};
		__le32 i_addr[DEF_ADDRS_PER_INODE];
	};
        __le32 i_nid[DEF_NIDS_PER_INODE];
}

Once F2FS_EXTRA_ATTR is set, we will steal four bytes in the head of
i_addr field for storing i_extra_isize and i_padding. with i_extra_isize,
we can calculate actual size of reserved space in i_addr, available
attribute fields included in total extra attribute fields for current
inode can be described as below:

  +--------------------+
  | .i_mode            |
  | ...                |
  | .i_ext             |
  +--------------------+
  | .i_extra_isize     |-----+
  | .i_padding         |     |
  | .i_prjid           |     |
  | .i_atime_extra     |     |
  | .i_ctime_extra     |     |
  | .i_mtime_extra     |<----+
  | .i_inode_cs        |<----- store blkaddr/inline from here
  | .i_xattr_cs        |
  | ...                |
  +--------------------+
  |                    |
  |    block address   |
  |                    |
  +--------------------+
  | .i_nid             |
  +--------------------+
  |   node_footer      |
  | (nid, ino, offset) |
  +--------------------+

Hence, with this patch, we would enhance scalability of f2fs inode for
storing more newly added attribute.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:30 -07:00
Chao Yu
f247037120 f2fs: make max inline size changeable
This patch tries to make below macros calculating max inline size,
inline dentry field size considerring reserving size-changeable
space:
- MAX_INLINE_DATA
- NR_INLINE_DENTRY
- INLINE_DENTRY_BITMAP_SIZE
- INLINE_RESERVED_SIZE

Then, when inline_{data,dentry} options is enabled, it allows us to
reserve inline space with different size flexibly for adding newly
introduced inode attribute.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:29 -07:00
Jaegeuk Kim
e65ef20781 f2fs: add ioctl to expose current features
This patch adds an ioctl to provide feature information to user.
For exapmle, SQLite can use this ioctl to detect whether f2fs support atomic
write or not.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:48:28 -07:00
Chao Yu
dc6febb6bc f2fs: make background threads of f2fs being aware of freezing
When ->freeze_fs is called from lvm for doing snapshot, it needs to
make sure there will be no more changes in filesystem's data, however,
previously, background threads like GC thread wasn't aware of freezing,
so in environment with active background threads, data of snapshot
becomes unstable.

This patch fixes this issue by adding sb_{start,end}_intwrite in
below background threads:
- GC thread
- flush thread
- discard thread

Note that, don't use sb_start_intwrite() in gc_thread_func() due to:

generic/241 reports below bug:

 ======================================================
 WARNING: possible circular locking dependency detected
 4.13.0-rc1+ #32 Tainted: G           O
 ------------------------------------------------------
 f2fs_gc-250:0/22186 is trying to acquire lock:
  (&sbi->gc_mutex){+.+...}, at: [<f8fa7f0b>] f2fs_sync_fs+0x7b/0x1b0 [f2fs]

 but task is already holding lock:
  (sb_internal#2){++++.-}, at: [<f8fb5609>] gc_thread_func+0x159/0x4a0 [f2fs]

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (sb_internal#2){++++.-}:
        __lock_acquire+0x405/0x7b0
        lock_acquire+0xae/0x220
        __sb_start_write+0x11d/0x1f0
        f2fs_evict_inode+0x2d6/0x4e0 [f2fs]
        evict+0xa8/0x170
        iput+0x1fb/0x2c0
        f2fs_sync_inode_meta+0x3f/0xf0 [f2fs]
        write_checkpoint+0x1b1/0x750 [f2fs]
        f2fs_sync_fs+0x85/0x1b0 [f2fs]
        f2fs_do_sync_file.isra.24+0x137/0xa30 [f2fs]
        f2fs_sync_file+0x34/0x40 [f2fs]
        vfs_fsync_range+0x4a/0xa0
        do_fsync+0x3c/0x60
        SyS_fdatasync+0x15/0x20
        do_fast_syscall_32+0xa1/0x1b0
        entry_SYSENTER_32+0x4c/0x7b

 -> #1 (&sbi->cp_mutex){+.+...}:
        __lock_acquire+0x405/0x7b0
        lock_acquire+0xae/0x220
        __mutex_lock+0x4f/0x830
        mutex_lock_nested+0x25/0x30
        write_checkpoint+0x2f/0x750 [f2fs]
        f2fs_sync_fs+0x85/0x1b0 [f2fs]
        sync_filesystem+0x67/0x80
        generic_shutdown_super+0x27/0x100
        kill_block_super+0x22/0x50
        kill_f2fs_super+0x3a/0x40 [f2fs]
        deactivate_locked_super+0x3d/0x70
        deactivate_super+0x40/0x60
        cleanup_mnt+0x39/0x70
        __cleanup_mnt+0x10/0x20
        task_work_run+0x69/0x80
        exit_to_usermode_loop+0x57/0x92
        do_fast_syscall_32+0x18c/0x1b0
        entry_SYSENTER_32+0x4c/0x7b

 -> #0 (&sbi->gc_mutex){+.+...}:
        validate_chain.isra.36+0xc50/0xdb0
        __lock_acquire+0x405/0x7b0
        lock_acquire+0xae/0x220
        __mutex_lock+0x4f/0x830
        mutex_lock_nested+0x25/0x30
        f2fs_sync_fs+0x7b/0x1b0 [f2fs]
        f2fs_balance_fs_bg+0xb9/0x200 [f2fs]
        gc_thread_func+0x302/0x4a0 [f2fs]
        kthread+0xe9/0x120
        ret_from_fork+0x19/0x24

 other info that might help us debug this:

 Chain exists of:
   &sbi->gc_mutex --> &sbi->cp_mutex --> sb_internal#2

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(sb_internal#2);
                                lock(&sbi->cp_mutex);
                                lock(sb_internal#2);
   lock(&sbi->gc_mutex);

  *** DEADLOCK ***

 1 lock held by f2fs_gc-250:0/22186:
  #0:  (sb_internal#2){++++.-}, at: [<f8fb5609>] gc_thread_func+0x159/0x4a0 [f2fs]

 stack backtrace:
 CPU: 2 PID: 22186 Comm: f2fs_gc-250:0 Tainted: G           O    4.13.0-rc1+ #32
 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
 Call Trace:
  dump_stack+0x5f/0x92
  print_circular_bug+0x1b3/0x1bd
  validate_chain.isra.36+0xc50/0xdb0
  ? __this_cpu_preempt_check+0xf/0x20
  __lock_acquire+0x405/0x7b0
  lock_acquire+0xae/0x220
  ? f2fs_sync_fs+0x7b/0x1b0 [f2fs]
  __mutex_lock+0x4f/0x830
  ? f2fs_sync_fs+0x7b/0x1b0 [f2fs]
  mutex_lock_nested+0x25/0x30
  ? f2fs_sync_fs+0x7b/0x1b0 [f2fs]
  f2fs_sync_fs+0x7b/0x1b0 [f2fs]
  f2fs_balance_fs_bg+0xb9/0x200 [f2fs]
  gc_thread_func+0x302/0x4a0 [f2fs]
  ? preempt_schedule_common+0x2f/0x4d
  ? f2fs_gc+0x540/0x540 [f2fs]
  kthread+0xe9/0x120
  ? f2fs_gc+0x540/0x540 [f2fs]
  ? kthread_create_on_node+0x30/0x30
  ret_from_fork+0x19/0x24

The deadlock occurs in below condition:
GC Thread			Thread B
- sb_start_intwrite
				- f2fs_sync_file
				 - f2fs_sync_fs
				  - mutex_lock(&sbi->gc_mutex)
				   - write_checkpoint
				    - block_operations
				     - f2fs_sync_inode_meta
				      - iput
				       - sb_start_intwrite
 - mutex_lock(&sbi->gc_mutex)

Fix this by altering sb_start_intwrite to sb_start_write_trylock.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-31 16:47:27 -07:00
Jaegeuk Kim
7a10f0177e f2fs: don't give partially written atomic data from process crash
This patch resolves the below scenario.

== Process 1 ==     == Process 2 ==
open(w)             open(rw)
begin
write(new_#1)
process_crash
  f_op->flush
  locks_remove_posix
  f_op>release
                    read (new_#1)

In order to avoid corrupted database caused by new_#1, we must do roll-back
at process_crash time. In order to check that, this patch keeps task which
triggers transaction begin, and does roll-back in f_op->flush before removing
file locks.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-28 17:49:01 -07:00
Jaegeuk Kim
640cc18982 f2fs: give a try to do atomic write in -ENOMEM case
It'd be better to retry writing atomic pages when we get -ENOMEM.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-28 17:49:00 -07:00
Ernesto A. Fernández
14af20fcb1 f2fs: preserve i_mode if __f2fs_set_acl() fails
When changing a file's acl mask, __f2fs_set_acl() will first set the
group bits of i_mode to the value of the mask, and only then set the
actual extended attribute representing the new acl.

If the second part fails (due to lack of space, for example) and the
file had no acl attribute to begin with, the system will from now on
assume that the mask permission bits are actual group permission bits,
potentially granting access to the wrong users.

Prevent this by only changing the inode mode after the acl has been set.

Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-28 17:48:54 -07:00
Yunlei He
8790568255 f2fs: alloc new nids for xattr block in recovery
recovery file A:			recovery file B:
	-get_dnode_of_data
		-alloc_nid
					-recover_xattr_data
						-set_node_addr(sbi, &ni, NEW_ADDR, false);
							--->bug_on for nid has been used by file A

In recovery process, new allocated node blocks may "reuse" xattr block
nids, this patch alloc new nids for xattr blocks in recovery process to
avoid this problem.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-26 19:34:30 -07:00
Chao Yu
76a9dd85d4 f2fs: spread struct f2fs_dentry_ptr for inline path
Use f2fs_dentry_ptr structure to indicate inline dentry structure as
much as possible, so we can wrap inline dentry with size-fixed fields
to the one with size-changeable fields. With this change, we can
handle size-changeable inline dentry more easily.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-26 19:34:30 -07:00
Yunlei He
5f4ce6abc2 f2fs: remove unused input parameter
This patch remove unused input parameter in function
new_node_page.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Yong Sheng <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-26 19:34:30 -07:00
Linus Torvalds
25f6a53799 JFS fixes for 4.13
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIodevzQLVs53l6BhNqiEXrVAjGQFAll3YIwACgkQNqiEXrVA
 jGQE9w//b/MyL9MtvGcKI7u3V7RrqbodzoL42KxV98TI3y7rpmUcmRsqDT045ufD
 S+AOqnIhvH2XbsF7jvZ0UROUtErBgh+pIRqCctVqQM+GKE7p/KR0rY/4eMPlbqQL
 Q2L0ZbokHU4mgvo7SqSJkYFRd0PPBaaw4aJaf8gg1g0pCb29jcer4ycKKHCjz0Wz
 YS2/v7NVWysehihiz/JF6ga5/n6VZeXq3fa48Mmt4TDzKt4IaqgLtAEbPa+eddwW
 z/t4YIoiQ4JUYPnLNmG1Kd8lACfeqKkr2WIh3143ipof4Tj3ZaNc315wVQUl58LP
 6ZUg12tLDVH9OOPdI9VsVLostRbyKH3yUazULBXWIQQt6ZWcFiKFbturMlZevSJB
 OuGBsfpDDvkd+2n2iNcwane/+Ouw4LChzUvmp9/MuIRxinRVdXAVRjaUb2M54xSW
 qR/Yfw8qyky9tac1c1Md/bVo7oMEw0Xliv3HxmGKHNPLMOLzwfVTAw0gGXSrGFn8
 veUKCl1J1+9NWoNExDkyUjsmD1CdkhQk1gpmU70KWQlCgKCtBhWiX6rEy0l8pw1t
 G5UmFpgqG8g61ODuW0dbnSdImmS8mcbY4I1lSvQFb3qVtCeBLz9q7OKSCuavbs4M
 egtaUNVMJ22dVv12loj2OOyVdneUXbUB0WVga3kfEq7QRCtSjF4=
 =fVR9
 -----END PGP SIGNATURE-----

Merge tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggy

Pull JFS fixes from David Kleikamp.

* tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggy:
  jfs: preserve i_mode if __jfs_set_acl() fails
  jfs: Don't clear SGID when inheriting ACLs
  jfs: atomically read inode size
2017-07-25 08:51:57 -07:00
Linus Torvalds
505d5c1119 NFS client bugfixes for 4.13
Stable bugfixes:
 - Fix error reporting regression
 
 Bugfixes:
 - Fix setting filelayout ds address race
 - Fix subtle access bug when using ACLs
 - Fix setting mnt3_counts array size
 - Fix a couple of pNFS commit races
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAllyVYYACgkQ18tUv7Cl
 QOtTwBAA8ek0Ba5wIfwlQe4MvIX1t6v7q7Otrwdombhuw1a6nW410hFwu2SG8kGd
 fQWr1xnqRsMKwu83UuImtlYYvMa271TZgXxPNWx7KJpi/zocnigJRg5sVpkcRqza
 AjjPc245pHCooQoaTAlm5WrO9zDm0s7lRGuTB1cvPRsElnFVWT0jwhTASIDOU1Zy
 9K/hElAnXp/dZv/ydjSePTPPsVQPJWLbJPlSm6vAIQPyeXWUeCgAym0yf2FVvtsd
 AQeozaq9xq6tofVPZhsYWeBKswjTHs3FxL8vhDOF9gF3QMPm43mwsfrKVidWo4vW
 0UJnRZRCFgG0WIxxhA7l1Z9MovAsXlbWmFufgCa4Ev/bC5WuUT4ZEkjBGJw2vXD4
 0/lxkhD41PBhCl/LIod9OT6iJ8koifl50JUC4N67D2illFy9a7Btzx3EPxfDz9uG
 6jEek9x6B1xf7AC4HhxByN/E8gKX08N4Q/afxTFuAwrzKRKqI4Me3qbCyU86bp+T
 wiAxgVPVnmb/VBVLU68i7titdsA6U8ZO12FFqu9QOr9wHMXxa6108h2Nia9jVTdk
 EZhanXa6tJThQQ/QZicuR5hTtoM4BuikaaJSHZ4ODbgrAjsMAyqy4qsf4tf3LLAo
 tEDu5sJDmHJhhdtqzuz+OtDn5iJ2Nga6a4/fMwt9YU2ZQBm7X/8=
 =ZcQv
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.13-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client bugfixes from Anna Schumaker:
 "Stable bugfix:
   - Fix error reporting regression

  Bugfixes:
   - Fix setting filelayout ds address race
   - Fix subtle access bug when using ACLs
   - Fix setting mnt3_counts array size
   - Fix a couple of pNFS commit races"

* tag 'nfs-for-4.13-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS/filelayout: Fix racy setting of fl->dsaddr in filelayout_check_deviceid()
  NFS: Be more careful about mapping file permissions
  NFS: Store the raw NFS access mask in the inode's access cache
  NFSv3: Convert nfs3_proc_access() to use nfs_access_set_mask()
  NFS: Refactor NFS access to kernel access mask calculation
  net/sunrpc/xprt_sock: fix regression in connection error reporting.
  nfs: count correct array for mnt3_counts array size
  Revert commit 722f0b8911 ("pNFS: Don't send COMMITs to the DSes if...")
  pNFS/flexfiles: Handle expired layout segments in ff_layout_initiate_commit()
  NFS: Fix another COMMIT race in pNFS
  NFS: Fix a COMMIT race in pNFS
  mount: copy the port field into the cloned nfs_server structure.
  NFS: Don't run wake_up_bit() when nobody is waiting...
  nfs: add export operations
2017-07-21 16:26:01 -07:00
Linus Torvalds
99313414dd Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
 "This fixes a crash with SELinux and several other old and new bugs"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: check for bad and whiteout index on lookup
  ovl: do not cleanup directory and whiteout index entries
  ovl: fix xattr get and set with selinux
  ovl: remove unneeded check for IS_ERR()
  ovl: fix origin verification of index dir
  ovl: mark parent impure on ovl_link()
  ovl: fix random return value on mount
2017-07-21 16:24:22 -07:00
Trond Myklebust
1ebf980127 NFS/filelayout: Fix racy setting of fl->dsaddr in filelayout_check_deviceid()
We must set fl->dsaddr once, and once only, even if there are multiple
processes calling filelayout_check_deviceid() for the same layout
segment.

Reported-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 14:08:45 -04:00
Trond Myklebust
ecbb903c56 NFS: Be more careful about mapping file permissions
When mapping a directory, we want the MAY_WRITE permissions to reflect
whether or not we have permission to modify, add and delete the directory
entries. MAY_EXEC must map to lookup permissions.

On the other hand, for files, we want MAY_WRITE to reflect a permission
to modify and extend the file.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Trond Myklebust
bd8b244174 NFS: Store the raw NFS access mask in the inode's access cache
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Trond Myklebust
eda3e20847 NFSv3: Convert nfs3_proc_access() to use nfs_access_set_mask()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Trond Myklebust
15d4b73ac2 NFS: Refactor NFS access to kernel access mask calculation
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Eryu Guan
ecc7b435d2 nfs: count correct array for mnt3_counts array size
Array size of mnt3_counts should be the size of array
mnt3_procedures, not mnt_procedures, though they're same in size
right now. Found this by code inspection.

Fixes: 1c5876ddbd ("sunrpc: move p_count out of struct rpc_procinfo")
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 08:49:57 -04:00
Linus Torvalds
791f2df39b Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull misc filesystem fixes from Jan Kara:
 "Several ACL related fixes for ext2, reiserfs, and hfsplus.

  And also one minor isofs cleanup"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  hfsplus: Don't clear SGID when inheriting ACLs
  isofs: Fix off-by-one in 'session' mount option parsing
  reiserfs: preserve i_mode if __reiserfs_set_acl() fails
  ext2: preserve i_mode if ext2_set_acl() fails
  ext2: Don't clear SGID when inheriting ACLs
  reiserfs: Don't clear SGID when inheriting ACLs
2017-07-20 10:41:12 -07:00
Linus Torvalds
465b0dbb38 for-f2fs-v4.13-rc2
We've filed some bug fixes:
 - missing f2fs case in terms of stale SGID big, introduced by Jan
 - build error for seq_file.h
 - avoid cpu lockup
 - wrong inode_unlock in error case
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAllwICsACgkQQBSofoJI
 UNItLQ/8CrPqw7pOSoH72n79/d5Md7tKe5TNN2qZbjCVGj7qs2opOnGM8hhFtUTe
 nFzK84evSpIQlgdRJFJU82E55U0coa3ySHgCQSUnHOobTtNsdmwq7p21/xT5LV3s
 211zGYDgqtdp5/5ONHeD1ckF0QR9S9nWPuIRt9ef3bp2c7CfDrk+LLMrwSMeUlZo
 /uk5j32QPdME9ittqZ1bEZPl2FgwgmI4NFjyjGiHDK/ZYGhspHfa7FHjL8PW69UG
 pquiwlqHTg+i9wSc9byYALnJEs1XN6oW8E5TxO5zGqvfa77tQQb+qGHG9kYGDu64
 JMpAXort5ZKNatkLLMXOoojLWutthv70f1IQK3eGUHhiWmsYrWZHjzrDh8hkcgh7
 JMwGbYHrQlsAdk6B1r4MM8GW/telLufM3jTp7Fhpn1fLomWSE28JPtql9Ci5kIKX
 XxUF0y2HbC4ZI5LlY2umRzAfULaEFWEG/8X+wqTl3oE5Jv7Jthd69rpdjJvcQnPx
 iIz7J6BJopjAUoTUlXdSnWkP7VPkDOtDpAiu7cj16U39XSnIW/ceC+qLeP1J2R2c
 +hTg2pfYvh4eJGnNdxv4kZOxFFhjaEBReBPPgYOyCr7IPTtA+sucXO/zqWN6RH95
 tu8+Efl60eQbCt2Gh+JlBR7hXNsgk56ksZ8XaYhBM4VRIWZFc/0=
 =nVpP
 -----END PGP SIGNATURE-----

Merge tag 'for-f2fs-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:
 "We've filed some bug fixes:

   - missing f2fs case in terms of stale SGID bit, introduced by Jan

   - build error for seq_file.h

   - avoid cpu lockup

   - wrong inode_unlock in error case"

* tag 'for-f2fs-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: avoid cpu lockup
  f2fs: include seq_file.h for sysfs.c
  f2fs: Don't clear SGID when inheriting ACLs
  f2fs: remove extra inode_unlock() in error path
2017-07-20 10:30:16 -07:00
Amir Goldstein
0e082555ce ovl: check for bad and whiteout index on lookup
Index should always be of the same file type as origin, except for
the case of a whiteout index.  A whiteout index should only exist
if all lower aliases have been unlinked, which means that finding
a lower origin on lookup whose index is a whiteout should be treated
as a lookup error.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-07-20 11:08:21 +02:00
Amir Goldstein
61b674710c ovl: do not cleanup directory and whiteout index entries
Directory index entries are going to be used for looking up
redirected upper dirs by lower dir fh when decoding an overlay
file handle of a merge dir.

Whiteout index entries are going to be used as an indication that
an exported overlay file handle should be treated as stale (i.e.
after unlink of the overlay inode).

We don't know the verification rules for directory and whiteout
index entries, because they have not been implemented yet, so fail
to mount overlay rw if those entries are found to avoid corrupting
an index that was created by a newer kernel.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-07-20 11:08:21 +02:00
Miklos Szeredi
1d88f18373 ovl: fix xattr get and set with selinux
inode_doinit_with_dentry() in SELinux wants to read the upper inode's xattr
to get security label, and ovl_xattr_get() calls ovl_dentry_real(), which
depends on dentry->d_inode, but d_inode is null and not initialized yet at
this point resulting in an Oops.

Fix by getting the upperdentry info from the inode directly in this case.

Reported-by: Eryu Guan <eguan@redhat.com>
Fixes: 09d8b58673 ("ovl: move __upperdentry to ovl_inode")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-07-20 11:08:21 +02:00
Trond Myklebust
213297369c Revert commit 722f0b8911 ("pNFS: Don't send COMMITs to the DSes if...")
Doing the test without taking any locks is racy, and so really it makes
more sense to do it in the flexfiles code (which is the only case that
cares).

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-19 15:28:21 -04:00
Trond Myklebust
4b75053e9b pNFS/flexfiles: Handle expired layout segments in ff_layout_initiate_commit()
If the layout has expired due to a fencing event, then we should not
attempt to commit to the DS.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-19 15:28:21 -04:00