Minimal supplement to upstream Kernel Self Protection Project changes. Features already provided by SELinux + Yama and archs other than multiarch arm64 / x86_64 aren't in scope. Only tags have stable history. Shared IRC channel with KSPP: irc.libera.chat #linux-hardening
Find a file
Vlastimil Babka d6a24df006 mm, page_alloc: actually ignore mempolicies for high priority allocations
__alloc_pages_slowpath() has for a long time contained code to ignore
node restrictions from memory policies for high priority allocations.
The current code that resets the zonelist iterator however does
effectively nothing after commit 7810e6781e ("mm, page_alloc: do not
break __GFP_THISNODE by zonelist reset") removed a buggy zonelist reset.
Even before that commit, mempolicy restrictions were still not ignored,
as they are passed in ac->nodemask which is untouched by the code.

We can either remove the code, or make it work as intended.  Since
ac->nodemask can be set from task's mempolicy via alloc_pages_current()
and thus also alloc_pages(), it may indeed affect kernel allocations,
and it makes sense to ignore it to allow progress for high priority
allocations.

Thus, this patch resets ac->nodemask to NULL in such cases.  This
assumes all callers can handle it (i.e.  there are no guarantees as in
the case of __GFP_THISNODE) which seems to be the case.  The same
assumption is already present in check_retry_cpuset() for some time.

The expected effect is that high priority kernel allocations in the
context of userspace tasks (e.g.  OOM victims) restricted by mempolicies
will have higher chance to succeed if they are restricted to nodes with
depleted memory, while there are other nodes with free memory left.

It's not a new intention, but for the first time the code will match the
intention, AFAICS.  It was intended by commit 183f6371aa ("mm: ignore
mempolicies when using ALLOC_NO_WATERMARK") in v3.6 but I think it never
really worked, as mempolicy restriction was already encoded in nodemask,
not zonelist, at that time.

So originally that was for ALLOC_NO_WATERMARK only.  Then it was
adjusted by e46e7b77c9 ("mm, page_alloc: recalculate the preferred
zoneref if the context can ignore memory policies") and cd04ae1e2d
("mm, oom: do not rely on TIF_MEMDIE for memory reserves access") to the
current state.  So even GFP_ATOMIC would now ignore mempolicies after
the initial attempts fail - if the code worked as people thought it
does.

Link: http://lkml.kernel.org/r/20180612122624.8045-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
arch mm: convert return type of handle_mm_fault() caller to vm_fault_t 2018-08-17 16:20:28 -07:00
block for-4.19/block-20180812 2018-08-14 10:23:25 -07:00
certs Replace magic for trusting the secondary keyring with #define 2018-08-16 09:57:20 -07:00
crypto Replace magic for trusting the secondary keyring with #define 2018-08-16 09:57:20 -07:00
Documentation tools/vm/page-types.c: add support for idle page tracking 2018-08-17 16:20:28 -07:00
drivers mm: convert return type of handle_mm_fault() caller to vm_fault_t 2018-08-17 16:20:28 -07:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs fs/seq_file.c: simplify seq_file iteration code and interface 2018-08-17 16:20:28 -07:00
include vfs: discard ATTR_ATTR_FLAG 2018-08-17 16:20:28 -07:00
init Consolidation of Kconfig files by Christoph Hellwig. 2018-08-15 13:05:12 -07:00
ipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
kernel drm pull for 4.19-rc1 2018-08-15 17:39:07 -07:00
lib SCSI misc on 20180815 2018-08-15 22:06:26 -07:00
LICENSES LICENSES: Add Linux-OpenIB license text 2018-04-27 16:41:53 -06:00
mm mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-08-17 16:20:28 -07:00
net Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-08-15 16:01:47 -07:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
scripts scripts: add Python 3 compatibility to spdxcheck.py 2018-08-17 16:20:27 -07:00
security Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2018-08-15 22:54:12 -07:00
sound ALSA: update dell-wmi mic-mute registration to new world order 2018-08-15 19:08:10 -07:00
tools tools/vm/page-types.c: add support for idle page tracking 2018-08-17 16:20:28 -07:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt arm64 updates for 4.19 2018-08-14 16:39:13 -07:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap mailmap: remap some of my email addresses to kernel.org address 2018-08-06 13:15:16 -04:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS 9p: add Dominique Martinet to MAINTAINERS 2018-08-17 16:20:27 -07:00
Makefile Kconfig updates for v4.19 2018-08-15 12:50:10 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.