Commit graph

57210 commits

Author SHA1 Message Date
Josef Bacik
119e80df7d btrfs: call btrfs_create_pending_block_groups unconditionally
The first thing we do is loop through the list, this

if (!list_empty())
	btrfs_create_pending_block_groups();

thing is just wasted space.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:25 +01:00
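The reasoning in 119e80df7d can be illustrated with a small userspace sketch. It assumes, as the message states, that the callee starts by looping over the list, so an empty list simply means zero iterations and the list_empty() guard in the caller adds nothing. The demo_list type and the demo_* helpers are hypothetical stand-ins written for this illustration; they are not the kernel's list_head API or the btrfs function.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular list with a sentinel head, loosely modelled on list_head. */
struct demo_list {
	struct demo_list *next, *prev;
};

static void demo_list_init(struct demo_list *head)
{
	head->next = head;
	head->prev = head;
}

static void demo_list_add_tail(struct demo_list *item, struct demo_list *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/*
 * Hypothetical stand-in for a "process everything pending" function.
 * It drains the list and returns how many entries it handled. On an
 * empty list the loop body never runs, so callers need no guard.
 */
static int demo_create_pending(struct demo_list *head)
{
	int handled = 0;

	assert(head != NULL);
	while (head->next != head) {
		struct demo_list *item = head->next;

		/* Unlink the first entry from the list. */
		head->next = item->next;
		item->next->prev = head;
		handled++;
	}
	return handled;
}
```

Calling demo_create_pending() on a freshly initialised head returns 0 without touching anything, which is the property that makes the unconditional call safe in this model.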
Josef Bacik
fa781cea3d btrfs: make btrfs_destroy_delayed_refs use btrfs_delete_ref_head
Instead of open coding this stuff, use the helper.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:25 +01:00
Josef Bacik
3069bd2669 btrfs: make btrfs_destroy_delayed_refs use btrfs_delayed_ref_lock
We have this open coded in btrfs_destroy_delayed_refs, use the helper
instead.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:25 +01:00
Anand Jain
d1e1442065 btrfs: scrub: print messages when started or finished
The kernel log messages help debugging and auditing; add them for scrub.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:24 +01:00
David Sterba
ce3ded1061 btrfs: simplify workqueue name when allocating
The workqueue name is constructed from a format string but the prefix
does not need to be set by %s.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:24 +01:00
Anand Jain
09ba3bc9dd btrfs: merge btrfs_find_device and find_device
Both btrfs_find_device() and find_device() do the same thing except
that the latter does not take the seed device into account in the device
scanning context. We can merge them.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:24 +01:00
Anand Jain
70bc7088aa btrfs: refactor btrfs_free_stale_devices() to get return value
Preparatory patch to add an ioctl that allows forgetting a device (i.e.
the reverse of scan).

Refactor btrfs_free_stale_devices() to return a status, as this function
can fail if it can't find the given path (returns -ENOENT) or if it tries
to delete a mounted device (returns -EBUSY).

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:23 +01:00
Anand Jain
e4319cd9ca btrfs: refactor btrfs_find_device() take fs_devices as argument
btrfs_find_device() accepts fs_info as an argument and retrieves
fs_devices from fs_info.

Instead use fs_devices, so that this function can be used in non-mount
(during device scanning) context as well.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:23 +01:00
Anand Jain
6e927cebe2 btrfs: cleanup btrfs_find_device_by_devspec()
btrfs_find_device_by_devspec() finds the device by @devid or by
@device_path. This patch makes code flow easy to read by open coding the
else part and renames devpath to device_path.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:23 +01:00
Anand Jain
d95a830c78 btrfs: merge btrfs_find_device_missing_or_by_path() into parent
btrfs_find_device_missing_or_by_path() is a relatively small function, and
its only parent btrfs_find_device_by_devspec() is small as well. Besides,
there are a number of find_device functions. Merge
btrfs_find_device_missing_or_by_path() into its parent.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:22 +01:00
Nikolay Borisov
02a033df7a btrfs: Remove not_found_em label from btrfs_get_extent
In order to avoid duplicating init code for em there is an additional
label, not_found_em, which is used to only set ->block_start. The only
case when it will be used is if the extent we are adding overlaps with
an existing extent. Make that case more obvious by:

 1. Adding a comment hinting at what's going on
 2. Assigning EXTENT_MAP_HOLE and directly going to insert.

 No functional changes.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:22 +01:00
Nikolay Borisov
b8eeab7fce btrfs: Consolidate retval checking of core btree functions
Core btree functions in btrfs generally return 0 when an item is found,
1 in case the sought item cannot be found and <0 when an error happens.
Consolidate the checks for those conditions in one 'if () {} else if ()
{}' construct rather than 2 separate 'if () {}' statements. This
emphasizes that the handling code pertains to a single function. No
functional changes.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:22 +01:00
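The return convention described in b8eeab7fce (0 when the item is found, 1 when it is not, <0 on error) and the consolidated check can be sketched in plain C as follows. This is a minimal illustration under stated assumptions: demo_search() and demo_lookup() are hypothetical names invented here, not btrfs functions, and the "key 42 exists" rule only serves to produce all three outcomes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical search: 0 = found, 1 = not found, <0 = error. */
static int demo_search(int key)
{
	if (key < 0)
		return -EIO;
	return key == 42 ? 0 : 1;
}

/*
 * Both non-zero outcomes of the same call are handled in a single
 * if/else-if chain instead of two separate if statements.
 */
static int demo_lookup(int key, int *found)
{
	int ret;

	assert(found != NULL);
	*found = 0;

	ret = demo_search(key);
	if (ret < 0) {
		return ret;	/* hard error: propagate it */
	} else if (ret > 0) {
		return 0;	/* item absent: not an error for this caller */
	}

	*found = 1;
	return 0;
}
```

Keeping the two branches in one construct makes it visible at a glance that both refer to the result of the same call.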
Nikolay Borisov
694c12ed9d btrfs: Rename found_type to extent_type in btrfs_get_extent
found_type really holds the type of extent and is guaranteed to have
a value between [0, 2]. The only time it can contain anything different
is if btrfs_lookup_file_extent returned a positive value and the
previous item is different than an extent. Avoid this situation by
simply checking found_key.type rather than assigning the item type to
found_type intermittently. Also make the variable a u8 to reduce stack
usage. No functional changes.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:22 +01:00
Filipe Manana
500710d3b8 Btrfs: move duplicated nodatasum check into common reflink/dedupe helper
Move the check that verifies if both inodes have checksums disabled or
both have them enabled, from the clone and deduplication functions into
the new common helper btrfs_remap_file_range_prep().

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:21 +01:00
Nikolay Borisov
951e05a904 btrfs: Remove impossible condition from mergable_maps
We can never have extents marked as EXTENT_MAP_DELALLOC since this
value is only ever used by btrfs_get_extent_fiemap. In this case the
extent map is created by btrfs_get_extent_fiemap and is never really
published, this flag is used to return the corresponding userspace one.
Considering this, it's pointless having a check for EXTENT_MAP_DELALLOC
in mergable_maps. Just remove it.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:21 +01:00
Filipe Manana
d00c2d9c76 Btrfs: do not overwrite error return value in the balance ioctl
If the call to btrfs_balance() failed we would overwrite the error
returned to user space with -EFAULT if the call to copy_to_user() failed
as well. Fix that by calling copy_to_user() only if btrfs_balance()
returned success or was canceled.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:21 +01:00
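The error-overwrite pattern fixed in d00c2d9c76 (and in the sibling ioctl fixes in this series) can be sketched in userspace C. This is a simplified model, not the btrfs code: demo_copy_out() is a hypothetical stand-in for copy_to_user() that "faults" on a NULL destination, op_ret stands for the result of the underlying operation, and the "canceled" case mentioned in the message is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_to_user(): non-zero means the copy faulted. */
static int demo_copy_out(void *dst, const void *src, size_t len)
{
	assert(src != NULL);
	if (dst == NULL)
		return 1;
	memcpy(dst, src, len);
	return 0;
}

/* Buggy shape: a failed copy replaces whatever the operation returned. */
static int demo_ioctl_buggy(int op_ret, int *user_buf)
{
	int result = op_ret;
	int ret = op_ret;

	if (demo_copy_out(user_buf, &result, sizeof(result)))
		ret = -EFAULT;
	return ret;
}

/* Fixed shape: copy the result out only when the operation succeeded. */
static int demo_ioctl_fixed(int op_ret, int *user_buf)
{
	int result = op_ret;
	int ret = op_ret;

	if (ret == 0 && demo_copy_out(user_buf, &result, sizeof(result)))
		ret = -EFAULT;
	return ret;
}
```

With a failing operation and a faulting destination, the buggy variant reports -EFAULT while the fixed variant still reports the original error, which is the behaviour the commit message describes.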
Filipe Manana
d3a53286c1 Btrfs: do not overwrite error return value in the device replace ioctl
If the call to btrfs_dev_replace_by_ioctl() failed we would overwrite the
error returned to user space with -EFAULT if the call to copy_to_user()
failed as well. Fix that by calling copy_to_user() only if no error
happened before or a device replace operation was canceled.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:20 +01:00
Filipe Manana
0f39b60563 Btrfs: remove redundant check for swapfiles when reflinking
Checking if either of the inodes corresponds to a swapfile is already
performed by generic_remap_file_range_prep(), so we do not need to do
it in the btrfs clone and deduplication functions.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:20 +01:00
Nikolay Borisov
420829d8ea btrfs: Refactor shrink_delalloc
Add a couple of comments regarding the logic flow in shrink_delalloc.
Then, cease using max_reclaim as a temporary variable when calculating
nr_pages. Finally give max_reclaim a more becoming name, which
unequivocally shows what this variable really holds. No functional
changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:20 +01:00
Nikolay Borisov
4546d17874 btrfs: Document logic regarding inode in async_cow_submit
Add a comment explaining when ->inode could be NULL and why we always
perform the ->async_delalloc_pages modification.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:20 +01:00
Nikolay Borisov
a1d64ba609 btrfs: Remove WARN_ON in btrfs_alloc_delalloc_work
It can never trigger since before calling alloc_delalloc_work we have
called igrab in start_delalloc_inodes.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:19 +01:00
Nikolay Borisov
bd4691a0e8 btrfs: Use ihold instead of igrab in cow_file_range_async
ihold is supposed to be used when the caller already has a reference to
the inode. In the case of cow_file_range_async this invariant holds,
since the 3 call chains leading to this function all take a reference:

btrfs_writepage  <--- does igrab
 extent_write_full_page
  __extent_writepage
   writepage_delalloc
     btrfs_run_delalloc_range
      cow_file_range_async

extent_write_cache_pages <--- does igrab
 __extent_writepage (same callchain as above)

and

submit_compressed_extents <-- already called from async CoW submit path,
			      which would have done ihold.
 extent_write_locked_range
  __extent_writepage

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add comment ]
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:19 +01:00
Nikolay Borisov
62b3762271 btrfs: Remove isize local variable in compress_file_range
It's used only once so just inline the call to i_size_read. The
semantics regarding the inode size are not changed, the pages in the
range are locked and i_size cannot change between the time it was set
and used.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:19 +01:00
Nikolay Borisov
532425ff9e btrfs: Remove inode argument from async_cow_submit
We already pass the async_cow struct that holds a reference to the
inode. Exploit this fact and remove the extra inode argument. No
functional changes.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:18 +01:00
YueHaibing
aa704d4e75 btrfs: remove set but not used variable 'num_pages'
Fixes gcc '-Wunused-but-set-variable' warning:

fs/btrfs/ioctl.c: In function 'btrfs_extent_same':
fs/btrfs/ioctl.c:3260:6: warning:
 variable 'num_pages' set but not used [-Wunused-but-set-variable]

It is not used any more since commit 9ee8234e6220 ("Btrfs: use
generic_remap_file_range_prep() for cloning and deduplication")

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:18 +01:00
Nikolay Borisov
02950af4e3 btrfs: Remove redundant assignment in btrfs_get_extent_fiemap
hole_len is only used if the hole falls within the requested range. Make
that explicitly clear by only assigning in the corresponding branch.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:18 +01:00
Nikolay Borisov
f3714ef479 btrfs: Refactor btrfs_get_extent_fiemap
Make btrfs_get_extent_fiemap a bit more friendly. First step is to
rename the closely related, yet arbitrarily named
range_start/found_end/found variables. They define the delalloc range
that is found in case a real extent wasn't found. Subsequently remove
an unnecessary check for hole_em since it's guaranteed to be set, i.e. the
check is always true. Top it off by giving all comments a refresh.

No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ reformatted a few more comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:18 +01:00
Nikolay Borisov
4ab47a8d9c btrfs: Remove unused arguments from btrfs_get_extent_fiemap
This function is a simple wrapper over btrfs_get_extent that returns
either:

a) A real extent in the passed range or
b) Adjusted extent based on whether delalloc bytes are found backing up
   a hole.

To support these semantics it doesn't need the page/pg_offset/create
arguments which are passed to btrfs_get_extent in case an extent is to
be created. So simplify the function by removing the unused arguments.
No functional changes.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:17 +01:00
Filipe Manana
a087349066 Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl
We are holding a transaction handle when setting an acl, therefore we can
not allocate the xattr value buffer using GFP_KERNEL, as we could deadlock
if reclaim is triggered by the allocation, therefore setup a nofs context.

Fixes: 39a27ec100 ("btrfs: use GFP_KERNEL for xattr and acl allocations")
CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:17 +01:00
Filipe Manana
b89f6d1fcb Btrfs: setup a nofs context for memory allocation at btrfs_create_tree()
We are holding a transaction handle when creating a tree, therefore we can
not allocate the root using GFP_KERNEL, as we could deadlock if reclaim is
triggered by the allocation, therefore setup a nofs context.

Fixes: 74e4d82757 ("btrfs: let callers of btrfs_alloc_root pass gfp flags")
CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:16 +01:00
Filipe Manana
eee9957754 Btrfs: do not overwrite error return value in the get device stats ioctl
If the call to btrfs_get_dev_stats() failed we would overwrite the error
returned to user space with -EFAULT if the call to copy_to_user() failed
as well. Fix that by calling copy_to_user() only if btrfs_get_dev_stats()
returned success.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:16 +01:00
Filipe Manana
4fa99b008f Btrfs: do not overwrite error return value in scrub progress ioctl
If the call to btrfs_scrub_progress() failed we would overwrite the error
returned to user space with -EFAULT if the call to copy_to_user() failed
as well. Fix that by calling copy_to_user() only if btrfs_scrub_progress()
returned success.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:16 +01:00
Filipe Manana
06fe39ab15 Btrfs: do not overwrite scrub error with fault error in scrub ioctl
If scrub returned an error and then the copy_to_user() call did not
succeed, we would overwrite the error returned by scrub with -EFAULT.
Fix that by calling copy_to_user() only if btrfs_scrub_dev() returned
success.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:15 +01:00
Nikolay Borisov
bc9a8bf79c btrfs: Make first argument of btrfs_run_delalloc_range directly an inode
Since this function is no longer a callback there is no need to have
its first argument obfuscated with a void *. Change it directly to a
pointer to an inode. No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:15 +01:00
Julia Lawall
9cf10cc195 Btrfs: drop useless LIST_HEAD in merge_reloc_root
Drop LIST_HEAD where the variable it declares is never used.

The uses were removed in 3fd0a5585e ("Btrfs: Metadata ENOSPC
handling for balance"), but not the declaration.

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
identifier x;
@@
- LIST_HEAD(x);
  ... when != x
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:15 +01:00
8a61716ff2 Two bug fixes for old issues, both marked for stable.

Merge tag 'ceph-for-5.0-rc8' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Two bug fixes for old issues, both marked for stable"

* tag 'ceph-for-5.0-rc8' of git://github.com/ceph/ceph-client:
  ceph: avoid repeatedly adding inode to mdsc->snap_flush_list
  libceph: handle an empty authorize reply
2019-02-21 09:43:37 -08:00
Michal Hocko
b2b469939e proc, oom: do not report alien mms when setting oom_score_adj
Tetsuo has reported that creating thousands of processes sharing MM
without SIGHAND (aka alien threads) and setting
/proc/<pid>/oom_score_adj will swamp the kernel log and take ages [1]
to finish.  This is especially worrisome because all that printing is
done under RCU lock and this can potentially trigger the RCU stall or
softlockup detector.

The primary reason for the printk was to catch potential users who might
depend on the behavior prior to 44a70adec9 ("mm, oom_adj: make sure
processes sharing mm have same view of oom_score_adj") but after more
than 2 years without a single report I guess it is safe to simply remove
the printk altogether.

The next step should be moving oom_score_adj over to the mm struct and
removing all the task crawling, as suggested by [2].

[1] http://lkml.kernel.org/r/97fce864-6f75-bca5-14bc-12c9f890e740@i-love.sakura.ne.jp
[2] http://lkml.kernel.org/r/20190117155159.GA4087@dhcp22.suse.cz

Link: http://lkml.kernel.org/r/20190212102129.26288-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Yong-Taek Lee <ytk.lee@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-21 09:01:00 -08:00
1f5a018c5b Merge branch 'fixes-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull keys fixes from James Morris:

 - Handle quotas better, allowing full quota to be reached.

 - Fix the creation of shortcuts in the assoc_array internal
   representation when the index key needs to be an exact multiple of
   the machine word size.

 - Fix a dependency loop between the request_key construction record and
   the request_key authentication key. The construction record isn't
   really necessary and can be dispensed with.

 - Set the timestamp on a new key rather than leaving it as 0. This
   would ordinarily be fine - provided the system clock is never set to
   a time before 1970.

* 'fixes-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  keys: Timestamp new keys
  keys: Fix dependency loop between construction record and auth key
  assoc_array: Fix shortcut creation
  KEYS: allow reaching the keys quotas exactly
2019-02-20 09:09:33 -08:00
Kees Cook
b5372fe5dc exec: load_script: Do not exec truncated interpreter path
Commit 8099b047ec ("exec: load_script: don't blindly truncate
shebang string") was trying to protect against a confused exec of a
truncated interpreter path. However, it was overeager and also refused
to truncate arguments as well, which broke userspace, and it was
reverted. This attempts the protection again, but allows arguments to
remain truncated. In an effort to improve readability, helper functions
and comments have been added.

Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Samuel Dionne-Riel <samuel@dionne-riel.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Graham Christensen <graham@grahamc.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-18 16:49:36 -08:00
Yan, Zheng
04242ff3ac ceph: avoid repeatedly adding inode to mdsc->snap_flush_list
Otherwise, mdsc->snap_flush_list may get corrupted.

Cc: stable@vger.kernel.org
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-02-18 18:08:29 +01:00
88fe73cb80 Two small fixes, one for crashes using nfs/krb5 with older enctypes, one
that could prevent clients from reclaiming state after a kernel upgrade.

Merge tag 'nfsd-5.0-2' of git://linux-nfs.org/~bfields/linux

Pull more nfsd fixes from Bruce Fields:
 "Two small fixes, one for crashes using nfs/krb5 with older enctypes,
  one that could prevent clients from reclaiming state after a kernel
  upgrade"

* tag 'nfsd-5.0-2' of git://linux-nfs.org/~bfields/linux:
  sunrpc: fix 4 more call sites that were using stack memory with a scatterlist
  Revert "nfsd4: return default lease period"
2019-02-16 17:38:01 -08:00
55638c520b More NFS client fixes for Linux 5.0
 - Make sure Send CQ is allocated on an existing compvec
 - Properly check debugfs dentry before using it
 - Don't use page_file_mapping() after removing a page

Merge tag 'nfs-for-5.0-4' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull more NFS client fixes from Anna Schumaker:
 "Three fixes this time.

  Nicolas's is for xprtrdma completion vector allocation on single-core
  systems. Greg's adds an error check when allocating a debugfs dentry.
  And Ben's is an additional fix for nfs_page_async_flush() to prevent
  pages from accidentally getting truncated.

  Summary:

   - Make sure Send CQ is allocated on an existing compvec

   - Properly check debugfs dentry before using it

   - Don't use page_file_mapping() after removing a page"

* tag 'nfs-for-5.0-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Don't use page_file_mapping after removing the page
  rpc: properly check debugfs dentry before using it
  xprtrdma: Make sure Send CQ is allocated on an existing compvec
2019-02-16 17:33:39 -08:00
David Howells
822ad64d7e keys: Fix dependency loop between construction record and auth key
In the request_key() upcall mechanism there's a dependency loop by which if
a key type driver overrides the ->request_key hook and the userspace side
manages to lose the authorisation key, the auth key and the internal
construction record (struct key_construction) can keep each other pinned.

Fix this by the following changes:

 (1) Killing off the construction record and using the auth key instead.

 (2) Including the operation name in the auth key payload and making the
     payload available outside of security/keys/.

 (3) The ->request_key hook is given the authkey instead of the cons
     record and operation name.

Changes (2) and (3) allow the auth key to naturally be cleaned up if the
keyring it is in is destroyed or cleared or the auth key is unlinked.

Fixes: 7ee02a316600 ("keys: Fix dependency loop between construction record and auth key")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15 14:12:09 -08:00
cb5b020a8d Revert "exec: load_script: don't blindly truncate shebang string"
This reverts commit 8099b047ec.

It turns out that people do actually depend on the shebang string being
truncated, and on the fact that an interpreter (like perl) will often
just re-interpret it entirely to get the full argument list.

Reported-by: Samuel Dionne-Riel <samuel@dionne-riel.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-14 15:02:18 -08:00
Bob Peterson
23e93c9b2c Revert "gfs2: read journal in large chunks to locate the head"
This reverts commit 2a5f14f279.

This patch causes xfstests generic/311 to fail. Reverting this for
now until we have a proper fix.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-14 09:52:51 -08:00
J. Bruce Fields
3bf6b57ec2 Revert "nfsd4: return default lease period"
This reverts commit d6ebf5088f.

I forgot that the kernel's default lease period should never be
decreased!

After a kernel upgrade, the kernel has no way of knowing on its own what
the previous lease time was.  Unless userspace tells it otherwise, it
will assume the previous lease period was the same.

So if we decrease this value in a kernel upgrade, we end up enforcing a
grace period that's too short, and clients will fail to reclaim state in
time.  Symptoms may include EIO and log messages like "NFS:
nfs4_reclaim_open_state: Lock reclaim failed!"

There was no real justification for the lease period decrease anyway.

Reported-by: Donald Buczek <buczek@molgen.mpg.de>
Fixes: d6ebf5088f "nfsd4: return default lease period"
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-02-14 12:33:19 -05:00
1f947a7a01 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "6 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: proc: smaps_rollup: fix pss_locked calculation
  Rename include/{uapi => }/asm-generic/shmparam.h really
  Revert "mm: use early_pfn_to_nid in page_ext_init"
  mm/gup: fix gup_pmd_range() for dax
  Revert "mm: slowly shrink slabs with a relatively small number of objects"
  Revert "mm: don't reclaim inodes with many attached pages"
2019-02-12 17:15:33 -08:00
Sandeep Patil
27dd768ed8 mm: proc: smaps_rollup: fix pss_locked calculation
The 'pss_locked' field of smaps_rollup was being calculated incorrectly.
It accumulated the current pss every time a locked VMA was found.  Fix
that by adding to 'pss_locked' at the same time as to 'pss' if the vma
being walked is locked.

Link: http://lkml.kernel.org/r/20190203065425.14650-1-sspatil@android.com
Fixes: 493b0e9d94 ("mm: add /proc/pid/smaps_rollup")
Signed-off-by: Sandeep Patil <sspatil@android.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: <stable@vger.kernel.org>	[4.14.x, 4.19.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-12 16:33:18 -08:00
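The accounting error described in 27dd768ed8 can be modelled with a small userspace sketch. It is a deliberately simplified assumption of how the rollup works: each VMA contributes one pss amount, and the totals are kept across VMAs. The demo_vma and demo_stats types and both demo_rollup_* functions are hypothetical and do not mirror the real smaps data structures.

```c
#include <assert.h>
#include <stddef.h>

struct demo_vma {
	unsigned long pss;	/* this VMA's pss contribution */
	int locked;		/* non-zero if the VMA is locked */
};

struct demo_stats {
	unsigned long pss;
	unsigned long pss_locked;
};

/* Buggy: adds the running pss total each time a locked VMA is seen. */
static struct demo_stats demo_rollup_buggy(const struct demo_vma *v, size_t n)
{
	struct demo_stats s = { 0, 0 };

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		s.pss += v[i].pss;
		if (v[i].locked)
			s.pss_locked += s.pss;
	}
	return s;
}

/* Fixed: adds to pss_locked the same amount that was just added to pss. */
static struct demo_stats demo_rollup_fixed(const struct demo_vma *v, size_t n)
{
	struct demo_stats s = { 0, 0 };

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		s.pss += v[i].pss;
		if (v[i].locked)
			s.pss_locked += v[i].pss;
	}
	return s;
}
```

For three VMAs with pss 10 (locked), 20 (unlocked) and 30 (locked), the buggy variant ends with pss_locked = 10 + 60 = 70, more than the total pss of 60, while the fixed variant yields 40.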
Dave Chinner
69056ee6a8 Revert "mm: don't reclaim inodes with many attached pages"
This reverts commit a76cf1a474 ("mm: don't reclaim inodes with many
attached pages").

This change causes serious changes to page cache and inode cache
behaviour and balance, resulting in major performance regressions when
combining workloads such as large file copies and kernel compiles.

  https://bugzilla.kernel.org/show_bug.cgi?id=202441

This change is a hack to work around the problems introduced by changing
how aggressive shrinkers are on small caches in commit 172b06c32b ("mm:
slowly shrink slabs with a relatively small number of objects").  It
creates more problems than it solves, wasn't adequately reviewed or
tested, so it needs to be reverted.

Link: http://lkml.kernel.org/r/20190130041707.27750-2-david@fromorbit.com
Fixes: a76cf1a474 ("mm: don't reclaim inodes with many attached pages")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Cc: Wolfgang Walter <linux@stwm.de>
Cc: Roman Gushchin <guro@fb.com>
Cc: Spock <dairinin@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-12 16:33:18 -08:00
Benjamin Coddington
d2ceb7e570 NFS: Don't use page_file_mapping after removing the page
If nfs_page_async_flush() removes the page from the mapping, then we can't
use page_file_mapping() on it as nfs_updatepage() is wont to do when
receiving an error.  Instead, push the mapping to the stack before the page
is possibly truncated.

Fixes: 8fc75bed96 ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-02-12 15:56:28 -05:00