linux-hardened/include
Mel Gorman a368ab67aa mm: move zone lock to a different cache line than order-0 free page lists
Huang Ying reported the following problem due to commit 3484b2de94 ("mm:
rearrange zone fields into read-only, page alloc, statistics and page
reclaim lines") from the Intel performance tests:

    24b7e5819a  3484b2de94
    ----------------  --------------------------
             %stddev     %change         %stddev
                 \          |                \
        152288 ±  0%     -46.2%      81911 ±  0%  aim7.jobs-per-min
           237 ±  0%     +85.6%        440 ±  0%  aim7.time.elapsed_time
           237 ±  0%     +85.6%        440 ±  0%  aim7.time.elapsed_time.max
         25026 ±  0%     +70.7%      42712 ±  0%  aim7.time.system_time
       2186645 ±  5%     +32.0%    2885949 ±  4%  aim7.time.voluntary_context_switches
       4576561 ±  1%     +24.9%    5715773 ±  0%  aim7.time.involuntary_context_switches

The problem is specific to very large machines under stress.  It was not
reproducible with the machines I had used to justify the original patch
because large numbers of CPUs are required.  When pressure is high enough,
the cache line is bouncing between CPUs trying to acquire the lock and the
holder of the lock adjusting free lists.  The intention was that the
acquirer of the lock would automatically have the cache line holding the
free lists but according to Huang, this is not a universal win.

One possibility is to move the zone lock to its own cache line but it
increases the size of the zone.  This patch moves the lock to the other
end of the free lists where they do not contend under high pressure.  It
does mean the page allocator paths now require more cache lines but Huang
reports that it restores performance to previous levels on large machines:

             %stddev     %change         %stddev
                 \          |                \
         84568 ±  1%     +94.3%     164280 ±  1%  aim7.jobs-per-min
       2881944 ±  2%     -35.1%    1870386 ±  8%  aim7.time.voluntary_context_switches
           681 ±  1%      -3.4%        658 ±  0%  aim7.time.user_time
       5538139 ±  0%     -12.1%    4867884 ±  0%  aim7.time.involuntary_context_switches
         44174 ±  1%     -46.0%      23848 ±  1%  aim7.time.system_time
           426 ±  1%     -48.4%        219 ±  1%  aim7.time.elapsed_time
           426 ±  1%     -48.4%        219 ±  1%  aim7.time.elapsed_time.max
           468 ±  1%     -43.1%        266 ±  2%  uptime.boot
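
The layout idea described above (keeping the lock at the far end of the
free lists so it does not share a cache line with the order-0 list) can be
illustrated with a small userspace sketch.  This is not the kernel's
struct zone: the struct names, the 64-byte CACHE_LINE, the MAX_ORDER value
and the plain int standing in for the spinlock are all assumptions made
for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */
#define MAX_ORDER  11   /* assumed number of free-list orders */

/* Hypothetical stand-ins for the kernel's list_head and free_area. */
struct list_head_sketch {
	void *next;
	void *prev;
};

struct free_area_sketch {
	struct list_head_sketch free_list;
	unsigned long nr_free;
};

/*
 * Hypothetical zone-like struct: the free lists start on a cache line
 * boundary and the lock is placed after them, at the other end.
 */
struct zone_sketch {
	_Alignas(CACHE_LINE) struct free_area_sketch free_area[MAX_ORDER];
	unsigned long flags;
	int lock;       /* stand-in for a spinlock */
};

/* Index of the cache line that a given byte offset falls into. */
static size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE;
}

/* Returns 1 if the lock and the order-0 free list share a cache line. */
int lock_shares_line_with_order0(void)
{
	size_t order0 = cache_line_of(offsetof(struct zone_sketch, free_area));
	size_t lock = cache_line_of(offsetof(struct zone_sketch, lock));

	return order0 == lock;
}

/* Aborts if the sketch layout puts the lock next to the order-0 list. */
void check_layout(void)
{
	assert(!lock_shares_line_with_order0());
}
```

In this sketch the free lists span several cache lines, so a CPU spinning
on the lock touches a different line than the lock holder updating the
order-0 list.  The cost, as noted above, is that the allocator path now
touches more cache lines.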

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Huang Ying <ying.huang@intel.com>
Tested-by: Huang Ying <ying.huang@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-07 16:45:33 -07:00
..
acpi Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2015-02-19 11:28:36 -08:00
asm-generic OK, this has the big virtio 1.0 implementation, as specified by OASIS. 2015-02-18 09:24:01 -08:00
clocksource
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2015-02-14 09:47:01 -08:00
drm drm/ttm: device address space != CPU address space 2015-03-05 09:04:39 +10:00
dt-bindings ARM: dts: am43xx: fix SLEWCTRL_FAST pinctrl binding 2015-03-06 09:21:03 -08:00
keys
kvm arm/arm64: KVM: Keep elrsr/aisr in sync with software model 2015-03-14 13:42:07 +01:00
linux mm: move zone lock to a different cache line than order-0 free page lists 2015-04-07 16:45:33 -07:00
math-emu
media [media] tea575x: split and export functions 2015-01-27 10:13:50 -02:00
memory
misc
net ipv6: protect skb->sk accesses from recursive dereference inside the stack 2015-04-06 16:12:49 -04:00
pcmcia
ras
rdma Revert "IB/core: Add support for extended query device caps" 2015-02-06 00:54:33 -08:00
rxrpc
scsi Merge branch 'for-3.20/core' of git://git.kernel.dk/linux-block 2015-02-12 14:13:23 -08:00
soc pm: at91: Workaround DDRSDRC self-refresh bug with LPDDR1 memories. 2015-03-03 19:43:59 +01:00
sound ALSA: pcm: allow for trigger_tstamp snapshot in .trigger 2015-02-09 16:01:53 +01:00
target target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0 2015-03-19 23:26:46 -07:00
trace regmap: introduce regmap_name to fix syscon regmap trace events 2015-03-19 20:04:55 +00:00
uapi Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2015-04-03 14:58:48 -07:00
video OMAPDSS: fix regression with display sysfs files 2015-02-26 10:23:15 +02:00
xen xen: Remove trailing semicolon from xenbus_register_frontend() definition 2015-03-02 10:38:59 +00:00
Kbuild