Commit graph

116857 commits

Author SHA1 Message Date
7c672dd601 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Just some missing syscall wire ups"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Wire up mlock2 system call.
  sparc: Add all necessary direct socket system calls.
2015-12-31 14:46:49 -08:00
David S. Miller
42d85c52f8 sparc: Wire up mlock2 system call.
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-31 15:38:56 -05:00
David S. Miller
8b30ca73b7 sparc: Add all necessary direct socket system calls.
The GLIBC folks would like to eliminate socketcall support
eventually, and this makes sense regardless so wire them
all up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-31 15:18:02 -05:00
Timo Sigurdsson
694341cf20 ARM: Fix broken USB support in sunxi_defconfig
Commit 69fb4dcada ("power: Add an axp20x-usb-power driver") introduced a new
driver for the USB power supply used on various Allwinner based SBCs. However,
the driver was not added to sunxi_defconfig which breaks USB support for some
boards (e.g. LeMaker BananaPi) as the kernel will now turn off the USB power
supply during boot by default if the driver isn't present. (This was not the
case in linux 4.3 or lower where the USB power was always left on.)

Hence, add the driver to sunxi_defconfig in order to keep USB support working
on those boards that require it.

Signed-off-by: Timo Sigurdsson <public_timo.s@silentcreek.de>
Reported-by: David Tulloh <david@tulloh.id.au>
Tested-by: David Tulloh <david@tulloh.id.au>
Tested-by: Timo Sigurdsson <public_timo.s@silentcreek.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2015-12-31 17:01:59 +01:00
Daniel J Blueman
dd7a5ab495 x86/numachip: Fix NumaConnect2 MMCFG PCI access
The MMCFG PCI accessors weren't being setup for NumacConnect2
correctly due to over-early assignment; this would create the
potential for the wrong PCI domain to be accessed.

Fix this by using the correct arch-specific PCI init function.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1451498807-15920-1-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-30 19:19:03 +01:00
Heiko Carstens
9236b4dd6b s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK
Use CONFIG_TOPOLOGY which selects CONFIG_SCHED_* all over the place to
reduce the random usage of the previous config options.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-30 10:34:57 +01:00
Heiko Carstens
c095a94932 s390/Kconfig: remove pointless 64 bit dependencies
s390 is always 64 bit.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-30 10:34:47 +01:00
866be88a1a Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "9 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/vmstat: fix overflow in mod_zone_page_state()
  ocfs2/dlm: clear migration_pending when migration target goes down
  mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
  ocfs2: fix flock panic issue
  m32r: add io*_rep helpers
  m32r: fix build failure
  arch/x86/xen/suspend.c: include xen/xen.h
  mm: memcontrol: fix possible memcg leak due to interrupted reclaim
  ocfs2: fix BUG when calculate new backup super
2015-12-29 18:04:41 -08:00
Sudip Mukherjee
92a8ed4c76 m32r: add io*_rep helpers
m32r allmodconfig was failing with the error:

  error: implicit declaration of function 'read'

On checking io.h it turned out that 'read' is not defined but 'readb' is
defined and 'ioread8' will then obviously mean 'readb'.

At the same time some of the helper functions ioreadN_rep() and
iowriteN_rep() were missing which also led to the build failure.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29 17:45:49 -08:00
Sudip Mukherjee
6122192eb6 m32r: fix build failure
m32r allmodconfig is failing with:

  In file included from ../include/linux/kvm_para.h:4:0,
                   from ../kernel/watchdog.c:26:
  ../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory

kvm_para.h was not included in the build.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29 17:45:49 -08:00
Andrew Morton
facca61683 arch/x86/xen/suspend.c: include xen/xen.h
Fix the build warning:

  arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend':
  arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration]
          if (xen_pv_domain())
              ^

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29 17:45:49 -08:00
Guenter Roeck
398c7500a1 MIPS: VDSO: Fix build error with binutils 2.24 and earlier
Commit 2a037f310b ("MIPS: VDSO: Fix build error") tries to fix a build
error seen with binutils 2.24 and earlier. However, the fix does not work,
and again results in the already known build errors if the kernel is built
with an earlier version of binutils.

CC      arch/mips/vdso/gettimeofday.o
/tmp/ccnOVbHT.s: Assembler messages:
/tmp/ccnOVbHT.s:50: Error: can't resolve `_start' {*UND* section} - `L0 {.text section}
/tmp/ccnOVbHT.s:374: Error: can't resolve `_start' {*UND* section} - `L0 {.text section}
scripts/Makefile.build:258: recipe for target 'arch/mips/vdso/gettimeofday.o' failed
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1

Fixes: 2a037f310b ("MIPS: VDSO: Fix build error")
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11926/
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-29 23:41:55 +01:00
Al Viro
76cc404bfd [PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()
Cc: stable@vger.kernel.org # 3.15+
Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-29 13:04:21 -05:00
Laura Abbott
9abb0ecdee x86/mm: Drop WARN from multi-BAR check
ioremapping multiple BARs produces a warning with a message "Your kernel is
fine". This message mostly serves to comfort kernel developers. Users do
not read the message, they only see the big scary warning which means
something must be horribly broken with their system. Less dramatically, the
warn also sets the taint flag which makes it difficult to differentiate
problems. If the kernel is actually fine as the warning claims it doesn't
make sense for it to be tainted. Change the WARN_ONCE to a pr_warn with the
caller of the ioremap.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Link: http://lkml.kernel.org/r/1450728074-31029-1-git-send-email-labbott@fedoraproject.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-29 12:34:38 +01:00
Jan Beulich
0d430e3fb3 x86/LDT: Print the real LDT base address
This was meant to print base address and entry count; make it do so
again.

Fixes: 37868fe113 "x86/ldt: Make modify_ldt synchronous"
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/56797D8402000078000C24F0@prv-mh.provo.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-29 12:34:38 +01:00
chengang@emindsoft.com.cn
0105c8d833 arch/x86/kernel/ptrace.c: Remove unused arg_offs_table
The related warning from gcc 6.0:

  arch/x86/kernel/ptrace.c:127:18: warning: ‘arg_offs_table’ defined but not used [-Wunused-const-variable]
   static const int arg_offs_table[] = {
                    ^~~~~~~~~~~~~~

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Link: http://lkml.kernel.org/r/1451137798-28701-1-git-send-email-chengang@emindsoft.com.cn
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-29 11:35:34 +01:00
Thomas Falcon
032c5e8284 Driver for IBM System i/p VNIC protocol
This is a new device driver for a high performance SR-IOV assisted virtual
network for IBM System p and IBM System i systems.  The SR-IOV VF will be
attached to the VIOS partition and mapped to the Linux client via the
hypervisor's VNIC protocol that this driver implements.

This driver is able to perform basic tx and rx, new features
and improvements will be added as they are being developed and tested.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-28 00:12:13 -05:00
3ae86f1a9f Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:

 - Fix bitrot in __get_user_unaligned()
 - EVA userspace accessor bug fixes.
 - Fix for build issues with certain toolchains.
 - Fix build error for VDSO with particular toolchain versions.
 - Fix build error due to a variable that should have been removed by an
   earlier patch

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix bitrot in __get_user_unaligned()
  MIPS: Fix build error due to unused variables.
  MIPS: VDSO: Fix build error
  MIPS: CPS: drop .set mips64r2 directives
  MIPS: uaccess: Take EVA into account in [__]clear_user
  MIPS: uaccess: Take EVA into account in __copy_from_user()
  MIPS: uaccess: Fix strlen_user with EVA
2015-12-27 18:12:21 -08:00
db0665012c ARM: SoC fixes for v4.4
A handful of fixes for OMAP, i.MX, Allwinner and Tegra:
 
 - A clock rate and a PHY setup fix for i.MX6Q/DL
 - A couple of fixes for the reduced serial bus (sunxi-rsb) on Allwinner
 - UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
 - Suspend fix for Tegra124 Chromebooks
 - Fix for missing implicit include that's different between ARM/ARM64
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWgDAQAAoJEIwa5zzehBx3muAP/RJD+oERDw/XiBSCl4k4kvTZ
 M4ep2ExO/GPXewKXyK7rlbNWMitBwttdT5OmXuuxT7BA/ODAGk90uvZSebIHRxkh
 bqnt+3njeR3M6FrTp6Np+yCh3I0hLYJoRInV3Vh8XQcml0LobIIq2BbdahIg/2JO
 sSDLAzJSR15CfnnKOSexvfRFoslLaKG0ffbxmR32HiMgnbq8ZfkRr7fANsrXTzoX
 oHxEV0uQXBY0d1TQ71rT/4KdTKN9QsdJI6xAjUGH95MYu+ZE0DEnW/wodd/f4e+p
 dyV+qoS2LHJnhYgJho3r19icM0iyNRm0Yt0GqP7ERN/y5GGa41slE9eThMG7WQ4h
 Ot5DEF2GmP7tSYhp4pqEeLqHnfou9+WzxxNP6wxGceqlg9EPuPRtzdCPStZYW4rD
 6+SQCvFs0vHZlEbbAfQLhRgDpvr9enGjCcWm6ntUcgwxFy0CPmRV9g5gIeipZOwi
 QfJM233tudUkDQxJYrCEgKxHy3a7T3K9LgYMM0VO1JRAEpaPIBmxZ0A+Y84WJbTE
 7r+r3eDC8RFF3bLbNRb/Ogt5OHOQUdElhRXcyz/BYA/X4HOT4wyVNM12aBlmW9uN
 aDB4HWE2lKV/h2pxWRR1Hem1NHYRUpcdVzkwRvncDHnxCAmb82BE2x1Ub3+fqa0v
 f0eO7GjUDRdPsQXn0zZn
 =kNRV
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "A smallish set of fixes that we've been sitting on for a while now,
  flushing the queue here so they go in.  Summary:

  A handful of fixes for OMAP, i.MX, Allwinner and Tegra:

   - A clock rate and a PHY setup fix for i.MX6Q/DL
   - A couple of fixes for the reduced serial bus (sunxi-rsb) on
     Allwinner
   - UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
   - Suspend fix for Tegra124 Chromebooks
   - Fix for missing implicit include that's different between
     ARM/ARM64"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: tegra: Fix suspend hang on Tegra124 Chromebooks
  bus: sunxi-rsb: Fix peripheral IC mapping runtime address
  bus: sunxi-rsb: Fix primary PMIC mapping hardware address
  ARM: dts: Fix UART wakeirq for omap4 duovero parlor
  ARM: OMAP2+: AM43xx: select ARM TWD timer
  ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST
  fsl-ifc: add missing include on ARM64
  ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
  ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
  bus: sunxi-rsb: unlock on error in sunxi_rsb_read()
  ARM: dts: sunxi: sun6i-a31s-primo81.dts: add touchscreen axis swapping property
2015-12-27 18:06:31 -08:00
Al Viro
930c0f708e MIPS: Fix bitrot in __get_user_unaligned()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-27 20:07:44 +01:00
Andrew Donnellan
dc3799bb9a powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
Fix off-by-one error in opal_mce_check_early_recovery() when checking
whether the NIP falls within OPAL space.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:41 +11:00
Andrew Donnellan
759fb100b2 powerpc: Fix style of self-test config prompts
A few of the config prompts for powerpc self-tests have periods at the
end, which is inconsistent with the rest of the prompts. Remove the
periods.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:40 +11:00
Michael Neuling
57a9039052 powerpc/powernv: Only delay opal_rtc_read() retry when necessary
Only delay opal_rtc_read() when busy and are going to retry.

This has the advantage of possibly saving a massive 10ms off booting!

Kudos to Stewart for noticing.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:40 +11:00
Russell Currey
affddff69c powerpc/powernv: Add a kmsg_dumper that flushes console output on panic
On BMC machines, console output is controlled by the OPAL firmware and is
only flushed when its pollers are called.  When the kernel is in a panic
state, it no longer calls these pollers and thus console output does not
completely flush, causing some output from the panic to be lost.

Output is only actually lost when the kernel is configured to not power off
or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
flushes the console buffer as part of its power down routines.  Before this
patch, however, only partial output would be printed during the timeout wait.

This patch adds a new kmsg_dumper which gets called at panic time to ensure
panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
in the OPAL API, and if that is not available, the pollers are called enough
times to (hopefully) completely flush the buffer.

The flushing mechanism will only affect output printed at and before the
kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
message may still be truncated as follows:

>Call Trace:
>[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable)
>[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4
>[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c
>[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254
>[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350
>[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130
>[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
>---[ end Kernel panic - not

This functionality is implemented as a kmsg_dumper as it seems to be the
most sensible way to introduce platform-specific functionality to the
panic function.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:40 +11:00
Michael Neuling
2fc251a8dd powerpc: Copy only required pieces of the mm_context_t to the paca
Currently we copy the whole mm_context_t to the paca but only access a
few bits of it.  This is wasteful of space paca and also takes quite
some time in the hot path of context switching.

This patch pulls in only the required bits from the mm_context_t to
the paca and on context switch, copies only those.

Benchmarking this (On top of Anton's recent MSR context switching
changes [1]) using processes and yield shows an improvement of almost
3% on POWER8:

  http://ozlabs.org/~anton/junkcode/context_switch2.c
  ./context_switch2 --test=yield --process 0 0

1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html

Signed-off-by: Michael Neuling <mikey@neuling.org>
[mpe: Rename paca fields to be mm_ctx_foo rather than context_foo]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:14 +11:00
12261f4ed4 ARC fixes for
- Unwinder rework (A revert followed by better fix)
  - Build errors: MMUv2, modules with -Os
  - highmem section mismatch build splat
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWfjpNAAoJEGnX8d3iisJeC08P+wRkkvtK0R5iZPqfTdSUaxFz
 CYoVM8mAMF+non+b7a4bWHChDJ+HlHxqsQWbq2A/EsET/TOXmnr28KquC/qfplwd
 p2I38xxsCSHK0hU1+x47VdxZdtO5VNdp+fsq7CQbHLUwg9TMCK4t8OpslrA3IbOG
 W+YlHQ+7Wwgu1GbHhWOHDTPb/yqX0WByX211HomURlo9Ppi3Xv3W6xK7uGn0wUxS
 DKp0pEWhLs8UQh6zGAf3zFJVjuuWpJHwAoXmQgfEeXYkA/ZOBCEh8sf+/fFZMxuK
 RAItN1ZTV3HwIYh3ykJyap877MBksb7nsCE5R0l+LQW0nV485omfJgWvo2hFUL5D
 Nrr93bcu8msn/mKfYtBaaQdohwlGsNY5GFaRGEv6TgKZWifFX+uUOBnZnbr2RLvn
 3/cDeeK/24sYvxaHGvBs3tjnJkwWPpofr9ZHCnQT+X8MgxjHQhcBesDq/8lb90m6
 G2RRqXQomdKkMBacejpKyvnulTEwMrU9itpxyIT7wfxCVGKxfdWHNHLW8KNtXsDn
 02Ae7XUyuciV6RIF9lXyGa+I73T/4/ut02b4Qq7PrdfLQucwZKjIQ9nj1Om9+taV
 1nSdm+VttO/Nqu2PoMKluxNHBpS1kxzEVFZWa/mXfFs+qmwRMLyKqjz0Tog+hBCv
 O7951Yzp3J88HTQIwUMD
 =h2pT
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "Sorry for this late pull request, but these are all important fixes
  for code introduced/updated in this release which we will otherwise
  end up back porting.

   - Unwinder rework (A revert followed by better fix)
   - Build errors: MMUv2, modules with -Os
   - highmem section mismatch build splat"

* tag 'arc-4.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: dw2 unwind: Catch Dwarf SNAFUs early
  ARC: dw2 unwind: Don't bail for CIE.version != 1
  Revert "ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing"
  ARC: Fix linking errors with CONFIG_MODULE + CONFIG_CC_OPTIMIZE_FOR_SIZE
  ARC: mm: fix building for MMU v2
  ARC: mm: HIGHMEM: Fix section mismatch splat
2015-12-26 14:58:06 -08:00
8db7b3c544 Merge branch 'parisc-4.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc system call restart fix from Helge Deller:
 "The architectural design of parisc always uses two instructions to
  call kernel syscalls (delayed branch feature).  This means that the
  instruction following the branch (located in the delay slot of the
  branch instruction) is executed before control passes to the branch
  destination.

  Depending on which assembler instruction and how it is used in
  usersapce in the delay slot, this sometimes made restarted syscalls
  like futex() and poll() failing with -ENOSYS"

* 'parisc-4.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix syscall restarts
2015-12-25 13:19:50 -08:00
682cb0cd82 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:

 1) Finally make perf stack backtraces stable on sparc, several problems
    (mostly due to the context in which the user copies from the stack
    are done) contributed to this.

    From Rob Gardner.

 2) Export ADI capability if the cpu supports it.

 3) Hook up userfaultfd system call.

 4) When faults happen during user copies we really have to clean up and
    restore the FPU state fully.  Also from Rob Gardner

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  tty/serial: Skip 'NULL' char after console break when sysrq enabled
  sparc64: fix FP corruption in user copy functions
  sparc64: Perf should save/restore fault info
  sparc64: Ensure perf can access user stacks
  sparc64: Don't set %pil in rtrap_nmi too early
  sparc64: Add ADI capability to cpu capabilities
  tty: serial: constify sunhv_ops structs
  sparc: Hook up userfaultfd system call
2015-12-25 13:15:23 -08:00
Rob Gardner
a7c5724b5c sparc64: fix FP corruption in user copy functions
Short story: Exception handlers used by some copy_to_user() and
copy_from_user() functions do not diligently clean up floating point
register usage, and this can result in a user process seeing invalid
values in floating point registers. This sometimes makes the process
fail.

Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions
use floating point registers and VIS alignaddr/faligndata to
accelerate data copying when source and dest addresses don't align
well. Linux uses a lazy scheme for saving floating point registers; It
is not done upon entering the kernel since it's a very expensive
operation. Rather, it is done only when needed. If the kernel ends up
not using FP regs during the course of some trap or system call, then
it can return to user space without saving or restoring them.

The various memcpy functions begin their FP code with VISEntry (or a
variation thereof), which saves the FP regs. They conclude their FP
code with VISExit (or a variation) which essentially marks the FP regs
"clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned
off so that a lazy restore will be triggered when/if the user process
accesses floating point regs again.

The bug is that the user copy variants of memcpy, copy_from_user() and
copy_to_user(), employ an exception handling mechanism to detect faults
when accessing user space addresses, and when this handler is invoked,
an immediate return from the function is forced, and VISExit is not
executed, thus leaving the fprs register in an indeterminate state,
but often with fprs.FPRS_FEF set and one or more dirty bits. This
results in a return to user space with invalid values in the FP regs,
and since fprs.FPRS_FEF is on, no lazy restore occurs.

This bug affects copy_to_user() and copy_from_user() for NG4, NG2,
U3, and U1. All are fixed by using a new exception handler for those
loads and stores that are done during the time between VISEnter and
VISExit.

n.b. In NG4memcpy, the problematic code can be triggered by a copy
size greater than 128 bytes and an unaligned source address.  This bug
is known to be the cause of random user process memory corruptions
while perf is running with the callgraph option (ie, perf record -g).
This occurs because perf uses copy_from_user() to read user stacks,
and may fault when it follows a stack frame pointer off to an
invalid page. Validation checks on the stack address just obscure
the underlying problem.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24 12:13:18 -05:00
Rob Gardner
833526941f sparc64: Perf should save/restore fault info
There have been several reports of random processes being killed with
a bus error or segfault during userspace stack walking in perf.  One
of the root causes of this problem is an asynchronous modification to
thread_info fault_address and fault_code, which stems from a perf
counter interrupt arriving during kernel processing of a "benign"
fault, such as a TSB miss. Since perf_callchain_user() invokes
copy_from_user() to read user stacks, a fault is not only possible,
but probable. Validity checks on the stack address merely cover up the
problem and reduce its frequency.

The solution here is to save and restore fault_address and fault_code
in perf_callchain_user() so that the benign fault handler is not
disturbed by a perf interrupt.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24 12:12:46 -05:00
Rob Gardner
3f74306ac8 sparc64: Ensure perf can access user stacks
When an interrupt (such as a perf counter interrupt) is delivered
while executing in user space, the trap entry code puts ASI_AIUS in
%asi so that copy_from_user() and copy_to_user() will access the
correct memory. But if a perf counter interrupt is delivered while the
cpu is already executing in kernel space, then the trap entry code
will put ASI_P in %asi, and this will prevent copy_from_user() from
reading any useful stack data in either of the perf_callchain_user_X
functions, and thus no user callgraph data will be collected for this
sample period. An additional problem is that a fault is guaranteed
to occur, and though it will be silently covered up, it wastes time
and could perturb state.

In perf_callchain_user(), we ensure that %asi contains ASI_AIUS
because we know for a fact that the subsequent calls to
copy_from_user() are intended to read the user's stack.

[ Use get_fs()/set_fs() -DaveM ]

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24 12:10:29 -05:00
Rob Gardner
1ca04a4ce0 sparc64: Don't set %pil in rtrap_nmi too early
Commit 28a1f53 delays setting %pil to avoid potential
hardirq stack overflow in the common rtrap_irq path.
Setting %pil also needs to be delayed in the rtrap_nmi
path for the same reason.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24 12:07:16 -05:00
Khalid Aziz
82924e542f sparc64: Add ADI capability to cpu capabilities
Add ADI (Application Data Integrity) capability to cpu capabilities list.
ADI capability allows virtual addresses to be encoded with a tag in
bits 63-60. This tag serves as an access control key for the regions
of virtual address with ADI enabled and a key set on them. Hypervisor
encodes this capability as "adp" in "hwcap-list" property in machine
description.

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24 12:05:06 -05:00
Hongtao Jia
3045e409e4 powerpc/mpc85xx: Add TMU device tree support for T1023/T1024
Also add nodes and properties for thermal management support. Meanwhile
preprocessor support is needed using thermal of framework.

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Reviewed-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-23 22:21:11 -06:00
Hongtao Jia
be489a3936 powerpc/mpc85xx: Add TMU device tree support for T1040/T1042
Also add nodes and properties for thermal management support. Meanwhile
preprocessor support is needed using thermal of framework.

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Reviewed-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-23 22:21:11 -06:00
Raghav Dogra
479f6a7fc6 powerpc/fsl_lbc: removal of dead code
The condition check is not used.

Signed-off-by: Raghav Dogra <raghav@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-23 15:02:00 -06:00
Mike Kravetz
9bcfd78ac0 sparc: Hook up userfaultfd system call
After hooking up system call, userfaultfd selftest was successful for
both 32 and 64 bit version of test.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-23 15:41:13 -05:00
Li Bin
e9b349f089 metag: ftrace: Fix the comments for ftrace_modify_code
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when
a module is being unloaded and before the text goes away and this
code grabs the ftrace_lock mutex and removes the module functions
from the ftrace list, such that it will no longer do any
modifications to that module's text, the update to make functions
be traced or not is done under the ftrace_lock mutex as well.
And by now, __init section codes should not been modified
by ftrace, because it is black listed in recordmcount.c and
ignored by ftrace.

Link: http://lkml.kernel.org/r/1449367378-29430-3-git-send-email-huawei.libin@huawei.com

Cc: linux-metag@vger.kernel.org
Acked-by: James Hogan <james.hogan@imgtec.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-12-23 14:27:25 -05:00
Li Bin
5243238ad5 sh: ftrace: Fix the comments for ftrace_modify_code()
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when
a module is being unloaded and before the text goes away and this
code grabs the ftrace_lock mutex and removes the module functions
from the ftrace list, such that it will no longer do any
modifications to that module's text, the update to make functions
be traced or not is done under the ftrace_lock mutex as well.
And by now, __init section codes should not been modified
by ftrace, because it is black listed in recordmcount.c and
ignored by ftrace.

Link: http://lkml.kernel.org/r/1449367378-29430-5-git-send-email-huawei.libin@huawei.com

Cc: linux-sh@vger.kernel.org
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-12-23 14:27:24 -05:00
Li Bin
cbbe12c43d ia64: ftrace: Fix the comments for ftrace_modify_code()
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when
a module is being unloaded and before the text goes away and this
code grabs the ftrace_lock mutex and removes the module functions
from the ftrace list, such that it will no longer do any
modifications to that module's text, the update to make functions
be traced or not is done under the ftrace_lock mutex as well.
And by now, __init section codes should not been modified
by ftrace, because it is black listed in recordmcount.c and
ignored by ftrace.

Link: http://lkml.kernel.org/r/1449214067-12177-2-git-send-email-huawei.libin@huawei.com

Cc: linux-ia64@vger.kernel.org
Acked-by: Tony Luck <tony.luck@intel.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-12-23 14:27:23 -05:00
Al Viro
c62432b40b [mips] switch pvc_proc_cleanup() to remove_proc_subtree()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-23 10:41:38 -05:00
Al Viro
b25472f9b9 new helpers: no_seek_end_llseek{,_size}()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-23 10:41:31 -05:00
Thomas Gleixner
b92c453d52 Revert "x86/kvm: On KVM re-enable (e.g. after suspend), update clocks"
This reverts commit 677a73a9aa. This patch was not meant to be merged and
has issues. Revert it.

Requested-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-23 12:35:21 +01:00
Zhao Qiang
b0417a2870 powerpc/p1010rdb: Update dts for pcie interrupt-map
p1010rdb uses the irq[4:5] for inta and intb to pcie,
it is active-high, so set it.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:23:22 -06:00
Scott Wood
2749539b4b powerpc/e6500: add locking to hugetlb
e6500 has threads but does not have TLB write conditional.  Thus,
the hugetlb code needs to take the same lock that the normal TLB miss
handlers take, to ensure that the tlbsx and tlbwe are atomic.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:23:22 -06:00
li pengbo
9d68e7accf powerpc/85xx: Enable TWR_P102x in mpc85xx_basic_defconfig
Enable TWR_P102x option by default in mpc85xx_basic_defconfig to support
p1025twr board.

Signed-off-by: Pengbo Li <Pengbo.Li@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:23:21 -06:00
Daniel Walker
433c858a61 powerpc/85xx: mpc85xx ADS: remove pci exclude
This code was reworked in commit,

905e75c46d

This change removed the fsl_add_bridge() which originally was above
the addition of the pci_exclude_device function. I think the assumption was that
the pci_exclude_device would prevent changes to the bridge PCI config after
it's been added. It seems it wasn't fully tested on MPC85xx ADS because
if you move the fsl_add_bridge() the pci_exclude_device is set in the machine
description then you can never update the PCI Config since the exclude
prevents it. This disrupts things like DMA.

This issue was extensively debugged by David Beazley.

Cc: xe-kernel@external.cisco.com
Cc: dbeazley@cisco.com
Cc: dwalker@fifo99.com
Signed-off-by: Daniel Walker <danielwa@cisco.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:23:21 -06:00
Igal Liberman
4e9de5e970 powerpc/mpc85xx: Update B4 FMan MURAM size
FMan V3H has 2 different MURAM sizes:
    In B4860/4420 the MURAM size is 512KB.
    In T4240 and T2080 the MURAM size is 384KB.

The MURAM size in FMan V3H device tree is 384KB.
This patch updates the MURAM size for B4 to 512KB.

Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:23:11 -06:00
Harninder Rai
720d7aebcd powerpc/85xx: Add PCIe controller support for bsc9132qds
1. Use machine_arch_initcall to hook mpc85xx_common_publish_devices
This can ensure before pcibios_init() is called, pci controllers have
been probed and added to the hose_list.
2. Add a workaround for errata A-005434
For the BSC9132, PEX_PEXIWARn[TRGT] for all windows defaults to 0xF,
which is mapped to CCSRBAR. However, for other products, 0xF is
mapped to the local memory. Therefore, for the BSC9132, any default
PCI Express access to the local memory (DDR) will now access the
CCSRBAR. This patch changes the mapping of targets of inbound windows
PEX_PEXIWARn[TRGT] to the Local address space – 0x0 (from 0xF).

Signed-off-by: Harninder Rai <harninder.rai@freescale.com>
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Hou Zhiqiang <B48286@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:17:15 -06:00
Harninder Rai
230dd6059a powerpc/fsl: Add PCI node in device tree of bsc9132qds
Signed-off-by: Harninder Rai <harninder.rai@freescale.com>
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Hou Zhiqiang <B48286@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:17:14 -06:00
e73a31778a KVM/ARM fixes for v4.4-rc7
- A series of fixes to the MTRR emulation, tested in the BZ by several users
   so they should be safe this late
 
 - A fix for a division by zero
 
 - Two very simple ARM and PPC fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWeV/6AAoJEL/70l94x66DqzsH/05YnLi2GsX5WeZHMfIUgzgT
 S/GoIkA7A4E2eXVoGg824MWppSViUzZkWgYFQTG4+KY9WPXzm9z2ij7DIlUHCD6n
 QfevgQx1kIu1obyhm6bYM2xUdM3f7NCsQgw9bXZObB0ay+b/+GjR9/RbCbx60EO5
 K1P+kveK6PFlS9/hc0PLztu6WkPV9BCO1RJUbeAEdnrMbpuQfHC+coR7MHRCiv2V
 iy8f1CqrGaO5YPm9/3GbdH1xMKew4OZShOxTXwtvUThdrLkks2c8sk6FoLzqkznH
 LMHVIpkm4mrIgThZG7VqZMXOrWvBtsCt04Vr9MzCM6QetB02b/Uz0xKvMYx2kZQ=
 =pmYz
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:

 - A series of fixes to the MTRR emulation, tested in the BZ by several
   users so they should be safe this late

 - A fix for a division by zero

 - Two very simple ARM and PPC fixes

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Reload pit counters for all channels when restoring state
  KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID
  KVM: MTRR: observe maxphyaddr from guest CPUID, not host
  KVM: MTRR: fix fixed MTRR segment look up
  KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
  KVM: arm/arm64: vgic: Fix kvm_vgic_map_is_active's dist check
  kvm: x86: move tracepoints outside extended quiescent state
  KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR
2015-12-22 15:47:39 -08:00
ad3d1abb30 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Two late bug fixes for kernel 4.4.

  Merry Christmas"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/dis: Fix handling of format specifiers
  s390/zcrypt: Fix AP queue handling if queue is full
2015-12-22 15:43:18 -08:00
Jon Hunter
80373d37be ARM: tegra: Fix suspend hang on Tegra124 Chromebooks
Enabling CPUFreq support for Tegra124 Chromebooks is causing the Tegra124
to hang when resuming from suspend.

When CPUFreq is enabled, the CPU clock is changed from the PLLX clock to
the DFLL clock during kernel boot. When resuming from suspend the CPU
clock is temporarily changed back to the PLLX clock before switching back
to the DFLL. If the DFLL is operating at a much lower frequency than the
PLLX when we enter suspend, and so the CPU voltage rail is at a voltage
too low for the CPUs to operate at the PLLX frequency, then the device
will hang.

Please note that the PLLX is used in the resume sequence to switch the CPU
clock from the very slow 32K clock to a faster clock during early resume
to speed up the resume sequence before the DFLL is resumed.

Ideally, we should fix this by setting the suspend frequency so that it
matches the PLLX frequency, however, that would be a bigger change. For
now simply disable CPUFreq support for Tegra124 Chromebooks to avoid the
hang when resuming from suspend.

Fixes: 9a0baee960 ("ARM: tegra: Enable CPUFreq support for Tegra124
		      Chromebooks")

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2015-12-22 15:41:37 -08:00
Mickaël Salaün
de3793796f um: Fix pointer cast
Fix a pointer cast typo introduced in v4.4-rc5 especially visible for
the i386 subarchitecture where it results in a kernel crash.

[ Also removed pointless cast as per Al Viro - Linus ]

Fixes: 8090bfd2bb ("um: Fix fpstate handling")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-22 15:31:51 -08:00
Zhao Qiang
7aa1aa6ece QE: Move QE from arch/powerpc to drivers/soc
ls1 has qe and ls1 has arm cpu.
move qe from arch/powerpc to drivers/soc/fsl
to adapt to powerpc and arm

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:12:56 -06:00
Zhao Qiang
302c059f2e QE: use subsys_initcall to init qe
Use subsys_initcall to init qe to adapt ARM architecture.
Remove qe_reset from PowerPC platform file.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:10:18 -06:00
Zhao Qiang
1291e49e89 QE/CPM: move muram management functions to qe_common
QE and CPM have the same muram, they use the same management
functions. Now QE support both ARM and PowerPC, it is necessary
to move QE to "driver/soc", so move the muram management functions
from cpm_common to qe_common for preparing to move QE code to "driver/soc"

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:10:18 -06:00
Zhao Qiang
0e6e01ff69 CPM/QE: use genalloc to manage CPM/QE muram
Use genalloc to manage CPM/QE muram instead of rheap.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:10:18 -06:00
Olof Johansson
741db4a72a Few fixes for omaps to allow am437x only builds to boot properly with
CPU_IDLE and ARM TWD timer. This is probably a common configuration setup
 for people making products with these SoCs so let's make sure it works.
 
 Also a wakeirq fix for duovero parlor making my life a bit easier as that
 allows me to run basic PM regression tests on it.
 
 It would be nice to have these in v4.4, but if it gets too late for that
 because of the holidays, it is not super critical if these get merged for
 v4.5.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWeGJhAAoJEBvUPslcq6VzU4wQAJqyWnzl5+frGf8TeTJTwAD2
 yDXbiTDYAZAZ4ajFY5aocrAvQfd/z9zU4ar6liylfAMzOM3S4HRDTaUYc2jbr42s
 +uNNRY3EGMRzbLuRfyZZRgui0aImBpIDXhBhK1njQoYWt9wWIVxNCv2I9pImWm+n
 kyZBDK7xIunQTndgMIMTk8MPVQ5tbPrFbkOWUAZT2bIOmhhhjf/fdMpVL47tbssc
 5XuS5YmyglT3spdaYzrAfZbxf19+CMVmTh2KD9FZOCv4OZPXvPOJ3JkqrJeARvPI
 Idt6DjitznShJ7wQuIuRH+uS7N2fEDviZR3Of1kZBawBxt7ig58O7TizZLRUtTk+
 Uj2QuXfccw8qu122chdpCI1eV5oOok157V3Yhsm7Oyi12H/eu5WMee2boJh6cz9R
 KIih+5/FX6yUCgfHow7XR6vpQ2A50BJoU55+EB3yb81hsy/t4eHxbFaJfUcs7DNg
 yLjtAxL1tO96VNFF+Kdgps69AOaHbo7/rNYgRvRZ4uxdSkE+YJhnHfcu6m59lFSR
 k7Ybw+CCvZ+jH3IE+aGLumPguL10ntytv16G2UDwCzuvCaq59fUPUOCdxuBf+RNB
 3Bd1NCkcsi8xPSypf7YXa3AfNM2XkNprHVZJadM57EBE3bVZ19+1OiBtf2gn+YAE
 72gzwGGPixUNr8OjZNX0
 =tGjX
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v4.4/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Few fixes for omaps to allow am437x only builds to boot properly with
CPU_IDLE and ARM TWD timer. This is probably a common configuration setup
for people making products with these SoCs so let's make sure it works.

Also a wakeirq fix for duovero parlor making my life a bit easier as that
allows me to run basic PM regression tests on it.

It would be nice to have these in v4.4, but if it gets too late for that
because of the holidays, it is not super critical if these get merged for
v4.5.

* tag 'omap-for-v4.4/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: Fix UART wakeirq for omap4 duovero parlor
  ARM: OMAP2+: AM43xx: select ARM TWD timer
  ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-12-22 13:24:29 -08:00
Olof Johansson
8b9c13347a The i.MX fixes for 4.4, 3rd round:
- Fix Ethernet PHY mode on i.MX6 Ventana boards, which can result in
   a non-functional Ethernet when Marvell phy driver rather than generic
   phy driver is selected.
 - Fix an assigned-clock configuration bug on imx6qdl-sabreauto board
   which was introduced by commit ed339363de ("ARM: dts:
   imx6qdl-sabreauto: Allow HDMI and LVDS to work simultaneously").
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWcnPzAAoJEFBXWFqHsHzOvfAIALZo+lVpA3c89ZuDJn0bKfsI
 hsV2wfMW1N5BhLN0AGjNIk/nk/YWhZ57gZyQ8miMdz2VeQRSgcjRaFFfEGRGOGop
 OrrKpgmRKmUNxMrvztLlMmb1hGxpHOO7z/CXO3UrzDRIiw9lzxYymt0Re2UHiE1u
 +tekMe7xBdimfMVmcM8RX/FOZC81tUjH3W8fyPbbzTTgrAnLPmXcfjtHkXy2VPw0
 5ikWCJhFWB9ODnmKHbdA7YsNh52JAe5BieWiLXzLj33XJNDBbiFBbBltbo+BJZnp
 ZaIek/fGHbq7ouCSLdRq7SEQn6ZbjpHhlic/okqNX0ux29QpY2GKTbiq8fLmcLU=
 =wMKP
 -----END PGP SIGNATURE-----

Merge tag 'imx-fixes-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes

The i.MX fixes for 4.4, 3rd round:
- Fix Ethernet PHY mode on i.MX6 Ventana boards, which can result in
  a non-functional Ethernet when Marvell phy driver rather than generic
  phy driver is selected.
- Fix an assigned-clock configuration bug on imx6qdl-sabreauto board
  which was introduced by commit ed339363de ("ARM: dts:
  imx6qdl-sabreauto: Allow HDMI and LVDS to work simultaneously").

* tag 'imx-fixes-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
  ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
2015-12-22 11:49:21 -08:00
Will Deacon
5d7ee87708 arm64: perf: add support for Cortex-A72
Cortex-A72 has a PMUv3 implementation that is compatible with the PMU
implemented by Cortex-A57.

This patch hooks up the new compatible string so that the Cortex-A57
event mappings are used.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-22 14:45:35 +00:00
Will Deacon
57d7412395 arm64: perf: add format entry to describe event -> config mapping
It's all very well providing an events directory to userspace that
details our events in terms of "event=0xNN", but if we don't define how
to encode the "event" field in the perf attr.config, then it's a waste
of time.

This patch adds a single format entry to describe that the event field
occupies the bottom 10 bits of our config field on ARMv8 (PMUv3).

Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-22 14:45:07 +00:00
Will Deacon
abff083ce2 ARM: perf: add format entry to describe event -> config mapping
It's all very well providing an events directory to userspace that
details our events in terms of "event=0xNN", but if we don't define how
to encode the "event" field in the perf attr.config, then it's a waste
of time.

This patch adds a single format entry to describe that the event field
occupies the bottom 8 bits of our config field on ARMv7.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-22 14:42:57 +00:00
Andrew Honig
0185604c2d KVM: x86: Reload pit counters for all channels when restoring state
Currently if userspace restores the pit counters with a count of 0
on channels 1 or 2 and the guest attempts to read the count on those
channels, then KVM will perform a mod of 0 and crash.  This will ensure
that 0 values are converted to 65536 as per the spec.

This is CVE-2015-7513.

Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22 15:36:26 +01:00
Paolo Bonzini
e24dea2afc KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID
Virtual machines can be run with CPUID such that there are no MTRRs.
In that case, the firmware will never enable MTRRs and it is obviously
undesirable to run the guest entirely with UC memory.  Check out guest
CPUID, and use WB memory if MTRR do not exist.

Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22 15:29:00 +01:00
Paolo Bonzini
fa7c4ebd5a KVM: MTRR: observe maxphyaddr from guest CPUID, not host
Conversion of MTRRs to ranges used the maxphyaddr from the boot CPU.
This is wrong, because var_mtrr_range's mask variable then is discontiguous
(like FF00FFFF000, where the first run of 0s corresponds to the bits
between host and guest maxphyaddr).  Instead always set up the masks
to be full 64-bit values---we know that the reserved bits at the top
are zero, and we can restore them when reading the MSR.  This way
var_mtrr_range gets a mask that just works.

Fixes: a13842dc66
Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22 15:28:56 +01:00
Alexis Dambricourt
a7f2d78657 KVM: MTRR: fix fixed MTRR segment look up
This fixes the slow-down of VM running with pci-passthrough, since some MTRR
range changed from MTRR_TYPE_WRBACK to MTRR_TYPE_UNCACHABLE.  Memory in the
0K-640K range was incorrectly treated as uncacheable.

Fixes: f7bfb57b3e
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexis Dambricourt <alexis.dambricourt@gmail.com>
[Use correct BZ for "Fixes" annotation.  - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22 15:28:37 +01:00
Ralf Baechle
ec7b97208a MIPS: Fix build error due to unused variables.
c861519fcf ("MIPS: Fix delay loops which may
be removed by GCC.") which made it upstream was an outdated version of the
patch and is lacking some the removal of two variables that became unused
thus resulting in further warnings and build breakage.  The commit
from ae878615d7cee5d7346946cf1ae1b60e427013c2 was correct however.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22 15:21:18 +01:00
Linus Walleij
36f46d6d5c ARM: 8482/1: l2x0: make it possible to disable outer sync from DT
According to commit 2503a5ecd8
"ARM: 6201/1: RealView: Do not use outer_sync() on ARM11MPCore
boards with L220" Some PB11MPCore RealView core tiles have broken
outer_sync.

We got rid of the custom barriers from the machine by disabling
outer sync, but that was just for the boardfile case. We have
to be able to do the same in the device tree case.

Since __l2c_init() is cloning and copying the L2C vtable,
we pass an argument to this function to optionally numb
the outer sync operation if desired, before initializing
the cache.

After this we can set up the cache correctly on the RealView
PB11MPCore. This was tested on a PB11MPCore known to have the
issue. Before this, spurious crashes would occur if we try to
set up the cache properly, after this it boots rock solid.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-22 12:15:53 +00:00
Marc Zyngier
e7273ff49a ARM: 8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI
Having IPI_CPU_BACKTRACE as SGI15 may not work if the kernel is
running in non-secure mode and that the secure firmware has
decided to follow ARM's recommendations that SGI8-15 should
be reserved for secure purpose.

Now that we are "only" using SGI0-6, change IPI_CPU_BACKTRACE
to use SGI7, which makes it more likely to work.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-22 12:09:44 +00:00
Marc Zyngier
89d798b73d ARM: 8487/1: Remove IPI_CALL_FUNC_SINGLE
Since 9a46ad6d6d ("smp: make smp_call_function_many() use logic
similar to smp_call_function_single()"), the core IPI handling
has been simplified, and generic_smp_call_function_interrupt is
now the same as generic_smp_call_function_single_interrupt.

This means that one of IPI_CALL_FUNC and IPI_CALL_FUNC_SINGLE has
become redundant. We can then safely drop IPI_CALL_FUNC_SINGLE,
and use only IPI_CALL_FUNC.

This has the advantage of reducing the number of SGI IDs we're using
(a fairly scarse resource).

Tested on a dual A7 board.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-22 12:09:43 +00:00
Lorenzo Pieralisi
f6419f240b ARM: 8485/1: cpuidle: remove cpu parameter from the cpuidle_ops suspend hook
The suspend() hook in the cpuidle_ops struct is always called on
the cpu entering idle, which means that the cpu parameter passed
to the suspend hook always corresponds to the local cpu, making
it somewhat redundant.

This patch removes the logical cpu parameter from the ARM
cpuidle_ops.suspend hook and updates all the existing kernel
implementations to reflect this change.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lina Iyer <lina.iyer@linaro.org>
Tested-by: Lina Iyer <lina.iyer@linaro.org>
Tested-by: Jisheng Zhang <jszhang@marvell.com> [psci]
Cc: Lina Iyer <lina.iyer@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-22 12:09:43 +00:00
Qais Yousef
2a037f310b MIPS: VDSO: Fix build error
Commit ebb5e78cc6 ("MIPS: Initial implementation of a VDSO") introduced a
build error.

For MIPS VDSO to be compiled it requires binutils version 2.25 or above but
the check in the Makefile had inverted logic causing it to be compiled in if
binutils is below 2.25.

This fixes the following compilation error:

CC      arch/mips/vdso/gettimeofday.o
/tmp/ccsExcUd.s: Assembler messages:
/tmp/ccsExcUd.s:62: Error: can't resolve `_start' {*UND* section} - `L0' {.text section}
/tmp/ccsExcUd.s:467: Error: can't resolve `_start' {*UND* section} - `L0' {.text section}
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
make[1]: *** [arch/mips/vdso] Error 2
make: *** [arch/mips] Error 2

[ralf@linux-mips: Fixed Sergei's complaint on the formatting of the
cited commit and generally reformatted the log message.]

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: alex@alex-smith.me.uk
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11745/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22 12:23:31 +01:00
Paul Burton
f3575e230c MIPS: CPS: drop .set mips64r2 directives
Commit 977e043d5e ("MIPS: kernel: cps-vec: Replace mips32r2 ISA level
with mips64r2") leads to .set mips64r2 directives being present in 32
bit (ie. CONFIG_32BIT=y) kernels. This is incorrect & leads to MIPS64
instructions being emitted by the assembler when expanding
pseudo-instructions. For example the "move" instruction can legitimately
be expanded to a "daddu". This causes problems when the kernel is run on
a MIPS32 CPU, as CONFIG_32BIT kernels of course often are...

Fix this by dropping the .set <ISA> directives entirely now that Kconfig
should be ensuring that kernels including this code are built with a
suitable -march= compiler flag.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: <stable@vger.kernel.org> # 3.16+
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10869/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22 12:16:32 +01:00
James Hogan
d6a428fb58 MIPS: uaccess: Take EVA into account in [__]clear_user
__clear_user() (and clear_user() which uses it), always access the user
mode address space, which results in EVA store instructions when EVA is
enabled even if the current user address limit is KERNEL_DS.

Fix this by adding a new symbol __bzero_kernel for the normal kernel
address space bzero in EVA mode, and call that from __clear_user() if
eva_kernel_access().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10844/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22 11:58:43 +01:00
James Hogan
6f06a2c45d MIPS: uaccess: Take EVA into account in __copy_from_user()
When EVA is in use, __copy_from_user() was unconditionally using the EVA
instructions to read the user address space, however this can also be
used for kernel access. If the address isn't a valid user address it
will cause an address error or TLB exception, and if it is then user
memory may be read instead of kernel memory.

For example in the following stack trace from Linux v3.10 (changes since
then will prevent this particular one still happening) kernel_sendmsg()
set the user address limit to KERNEL_DS, and tcp_sendmsg() goes on to
use __copy_from_user() with a kernel address in KSeg0.

[<8002d434>] __copy_fromuser_common+0x10c/0x254
[<805710e0>] tcp_sendmsg+0x5f4/0xf00
[<804e8e3c>] sock_sendmsg+0x78/0xa0
[<804e8f28>] kernel_sendmsg+0x24/0x38
[<804ee0f8>] sock_no_sendpage+0x70/0x7c
[<8017c820>] pipe_to_sendpage+0x80/0x98
[<8017c6b0>] splice_from_pipe_feed+0xa8/0x198
[<8017cc54>] __splice_from_pipe+0x4c/0x8c
[<8017e844>] splice_from_pipe+0x58/0x78
[<8017e884>] generic_splice_sendpage+0x20/0x2c
[<8017d690>] do_splice_from+0xb4/0x110
[<8017d710>] direct_splice_actor+0x24/0x30
[<8017d394>] splice_direct_to_actor+0xd8/0x208
[<8017d51c>] do_splice_direct+0x58/0x7c
[<8014eaf4>] do_sendfile+0x1dc/0x39c
[<8014f82c>] SyS_sendfile+0x90/0xf8

Add the eva_kernel_access() check in __copy_from_user() like the one in
copy_from_user().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10843/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22 11:55:24 +01:00
James Hogan
5dc62fdd83 MIPS: uaccess: Fix strlen_user with EVA
The strlen_user() function calls __strlen_kernel_asm in both branches of
the eva_kernel_access() conditional. For EVA it should be calling
__strlen_user_eva for user accesses, otherwise it will load from the
kernel address space instead of the user address space, and the access
checking will likely be ineffective at preventing it due to EVA's
overlapping user and kernel address spaces.

This was found after extending the test_user_copy module to cover user
string access functions, which gave the following error with EVA:

test_user_copy: illegal strlen_user passed

Fortunately the use of strlen_user() has been all but eradicated from
the mainline kernel, so only out of tree modules could be affected.

Fixes: e3a9b07a9c ("MIPS: asm: uaccess: Add EVA support for str*_user operations")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15.x-
Patchwork: https://patchwork.linux-mips.org/patch/10842/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22 11:54:13 +01:00
Greg Kroah-Hartman
bfccd95e7a Merge 4.4-rc6 into usb-next
We want the USB and PHY fixes in here as well to make things easier for
testing and development.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-21 11:11:12 -08:00
Will Deacon
c9cd0ed925 arm64: traps: address fallout from printk -> pr_* conversion
Commit ac7b406c1a ("arm64: Use pr_* instead of printk") was a fairly
mindless s/printk/pr_*/ change driven by a complaint from checkpatch.

As is usual with such changes, this has led to some odd behaviour on
arm64:

  * syslog now picks up the "pr_emerg" line from dump_backtrace, but not
    the actual trace, which leads to a bunch of "kernel:Call trace:"
    lines in the log

  * __{pte,pmd,pgd}_error print at KERN_CRIT, as opposed to KERN_ERR
    which is used by other architectures.

This patch restores the original printk behaviour for dump_backtrace
and downgrade the pgtable error macros to KERN_ERR.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21 17:26:02 +00:00
AKASHI Takahiro
20380bb390 arm64: ftrace: fix a stack tracer's output under function graph tracer
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function return. This will result in many useless entries
(return_to_handler) showing up in
 a) a stack tracer's output
 b) perf call graph (with perf record -g)
 c) dump_backtrace (at panic et al.)

For example, in case of a),
  $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
  $ echo 1 > /proc/sys/kernel/stack_trace_enabled
  $ cat /sys/kernel/debug/tracing/stack_trace
        Depth    Size   Location    (54 entries)
        -----    ----   --------
  0)     4504      16   gic_raise_softirq+0x28/0x150
  1)     4488      80   smp_cross_call+0x38/0xb8
  2)     4408      48   return_to_handler+0x0/0x40
  3)     4360      32   return_to_handler+0x0/0x40
  ...

In case of b),
  $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
  $ perf record -e mem:XXX:x -ag -- sleep 10
  $ perf report
                  ...
                  |          |          |--0.22%-- 0x550f8
                  |          |          |          0x10888
                  |          |          |          el0_svc_naked
                  |          |          |          sys_openat
                  |          |          |          return_to_handler
                  |          |          |          return_to_handler
                  ...

In case of c),
  $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
  $ echo c > /proc/sysrq-trigger
  ...
  Call trace:
  [<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30
  [<ffffffc000092250>] return_to_handler+0x0/0x40
  [<ffffffc000092250>] return_to_handler+0x0/0x40
  ...

This patch replaces such entries with real addresses preserved in
current->ret_stack[] at unwind_frame(). This way, we can cover all
the cases.

Reviewed-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[will: fixed minor context changes conflicting with irq stack bits]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21 17:26:02 +00:00
AKASHI Takahiro
fe13f95b72 arm64: pass a task parameter to unwind_frame()
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function's return. This will result in many useless entries
(return_to_handler) showing up in a call stack list.
We will fix this problem in a later patch ("arm64: ftrace: fix a stack
tracer's output under function graph tracer"). But since real return
addresses are saved in ret_stack[] array in struct task_struct,
unwind functions need to be notified of, in addition to a stack pointer
address, which task is being traced in order to find out real return
addresses.

This patch extends unwind functions' interfaces by adding an extra
argument of a pointer to task_struct.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21 17:26:01 +00:00
AKASHI Takahiro
79fdee9b63 arm64: ftrace: modify a stack frame in a safe way
Function graph tracer modifies a return address (LR) in a stack frame by
calling ftrace_prepare_return() in a traced function's function prologue.
The current code does this modification before preserving an original
address at ftrace_push_return_trace() and there is always a small window
of inconsistency when an interrupt occurs.

This doesn't matter, as far as an interrupt stack is introduced, because
stack tracer won't be invoked in an interrupt context. But it would be
better to proactively minimize such a window by moving the LR modification
after ftrace_push_return_trace().

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21 17:26:01 +00:00
James Morse
d224a69e3d arm64: remove irq_count and do_softirq_own_stack()
sysrq_handle_reboot() re-enables interrupts while on the irq stack. The
irq_stack implementation wrongly assumed this would only ever happen
via the softirq path, allowing it to update irq_count late, in
do_softirq_own_stack().

This means if an irq occurs in sysrq_handle_reboot(), during
emergency_restart() the stack will be corrupted, as irq_count wasn't
updated.

Lose the optimisation, and instead of moving the adding/subtracting of
irq_count into irq_stack_entry/irq_stack_exit, remove it, and compare
sp_el0 (struct thread_info) with sp & ~(THREAD_SIZE - 1). This tells us
if we are on a task stack, if so, we can safely switch to the irq stack.
Finally, remove do_softirq_own_stack(), we don't need it anymore.

Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[will: use get_thread_info macro]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21 17:26:01 +00:00
David Woods
66b3923a1a arm64: hugetlb: add support for PTE contiguous bit
The arm64 MMU supports a Contiguous bit which is a hint that the TTE
is one of a set of contiguous entries which can be cached in a single
TLB entry.  Supporting this bit adds new intermediate huge page sizes.

The set of huge page sizes available depends on the base page size.
Without using contiguous pages the huge page sizes are as follows.

 4KB:   2MB  1GB
64KB: 512MB

With a 4KB granule, the contiguous bit groups together sets of 16 pages
and with a 64KB granule it groups sets of 32 pages.  This enables two new
huge page sizes in each case, so that the full set of available sizes
is as follows.

 4KB:  64KB   2MB  32MB  1GB
64KB:   2MB 512MB  16GB

If a 16KB granule is used then the contiguous bit groups 128 pages
at the PTE level and 32 pages at the PMD level.

If the base page size is set to 64KB then 2MB pages are enabled by
default.  It is possible in the future to make 2MB the default huge
page size for both 4KB and 64KB granules.

Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: David Woods <dwoods@ezchip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21 17:26:00 +00:00
Andy Lutomirski
30bfa7b348 x86/entry: Restore traditional SYSENTER calling convention
It turns out that some Android versions hardcode the SYSENTER
calling convention.  This is buggy and will cause problems no
matter what the kernel does.  Nonetheless, we should try to
support it.

Credit goes to Linus for pointing out a clean way to handle
the SYSENTER/SYSCALL clobber differences while preserving
straightforward DWARF annotations.

I believe that the original offending Android commit was:

https://android.googlesource.com/platform%2Fbionic/+/7dc3684d7a2587e43e6d2a8e0e3f39bf759bd535

Reported-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-and-tested-by: Borislav Petkov <bp@alien8.de>
Cc: <mark.gross@intel.com>
Cc: Su Tao <tao.su@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: <frank.wang@intel.com>
Cc: <borun.fu@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Mingwei Shi <mingwei.shi@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-21 16:05:01 +01:00
Andy Lutomirski
6a613ac6bc x86/entry: Fix some comments
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-and-tested-by: Borislav Petkov <bp@alien8.de>
Cc: <mark.gross@intel.com>
Cc: Su Tao <tao.su@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: <qiuxu.zhuo@intel.com>
Cc: <frank.wang@intel.com>
Cc: <borun.fu@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Mingwei Shi <mingwei.shi@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-21 16:05:01 +01:00
Lorenzo Pieralisi
60792ad349 arm64: kernel: enforce pmuserenr_el0 initialization and restore
The pmuserenr_el0 register value is architecturally UNKNOWN on reset.
Current kernel code resets that register value iff the core pmu device is
correctly probed in the kernel. On platforms with missing DT pmu nodes (or
disabled perf events in the kernel), the pmu is not probed, therefore the
pmuserenr_el0 register is not reset in the kernel, which means that its
value retains the reset value that is architecturally UNKNOWN (system
may run with eg pmuserenr_el0 == 0x1, which means that PMU counters access
is available at EL0, which must be disallowed).

This patch adds code that resets pmuserenr_el0 on cold boot and restores
it on core resume from shutdown, so that the pmuserenr_el0 setup is
always enforced in the kernel.

Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21 14:43:04 +00:00
Stefano Stabellini
187b26a972 xen/x86: convert remaining timespec to timespec64 in xen_pvclock_gtod_notify
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2015-12-21 14:40:59 +00:00
Stefano Stabellini
7609686313 xen/x86: support XENPF_settime64
Try XENPF_settime64 first, if it is not available fall back to
XENPF_settime32.

No need to call __current_kernel_time() when all the info needed are
already passed via the struct timekeeper * argument.

Return NOTIFY_BAD in case of errors.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2015-12-21 14:40:59 +00:00
Stefano Stabellini
7d5f6f81dd xen/arm: set the system time in Xen via the XENPF_settime64 hypercall
If Linux is running as dom0, call XENPF_settime64 to update the system
time in Xen on pvclock_gtod notifications.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2015-12-21 14:40:58 +00:00
Stefano Stabellini
e709fba132 xen/arm: introduce xen_read_wallclock
Read the wallclock from the shared info page at boot time.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2015-12-21 14:40:58 +00:00
Stefano Stabellini
ab76078a3d arm: extend pvclock_wall_clock with sec_hi
The hypervisor actually exposes an additional field to struct
pvclock_wall_clock, with the high 32 bit seconds.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
2015-12-21 14:40:57 +00:00
Stefano Stabellini
f3d6027ee0 xen: introduce XENPF_settime64
Rename the current XENPF_settime hypercall and related struct to
XENPF_settime32.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2015-12-21 14:40:57 +00:00
Stefano Stabellini
72d39c691b xen/arm: introduce HYPERVISOR_platform_op on arm and arm64
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-12-21 14:40:56 +00:00
Stefano Stabellini
cfafae9403 xen: rename dom0_op to platform_op
The dom0_op hypercall has been renamed to platform_op since Xen 3.2,
which is ancient, and modern upstream Linux kernels cannot run as dom0
and it anymore anyway.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2015-12-21 14:40:55 +00:00
Stefano Stabellini
34e38523d5 xen/arm: account for stolen ticks
Register the runstate_memory_area with the hypervisor.
Use pv_time_ops.steal_clock to account for stolen ticks.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-12-21 14:40:55 +00:00
Stefano Stabellini
dfd57bc3a5 arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops
Introduce CONFIG_PARAVIRT and PARAVIRT_TIME_ACCOUNTING on ARM64.
Necessary duplication of paravirt.h and paravirt.c with ARM.

The only paravirt interface supported is pv_time_ops.steal_clock, so no
runtime pvops patching needed.

This allows us to make use of steal_account_process_tick for stolen
ticks accounting.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-21 14:40:54 +00:00
Stefano Stabellini
02c2433b3a arm: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops
Introduce CONFIG_PARAVIRT and PARAVIRT_TIME_ACCOUNTING on ARM.

The only paravirt interface supported is pv_time_ops.steal_clock, so no
runtime pvops patching needed.

This allows us to make use of steal_account_process_tick for stolen
ticks accounting.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christopher Covington <cov@codeaurora.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Russell King <linux@arm.linux.org.uk>
2015-12-21 14:40:54 +00:00
Stefano Stabellini
4ccefbe597 xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-12-21 14:40:52 +00:00
Helge Deller
71a71fb537 parisc: Fix syscall restarts
On parisc syscalls which are interrupted by signals sometimes failed to
restart and instead returned -ENOSYS which in the worst case lead to
userspace crashes.
A similiar problem existed on MIPS and was fixed by commit e967ef02
("MIPS: Fix restart of indirect syscalls").

On parisc the current syscall restart code assumes that all syscall
callers load the syscall number in the delay slot of the ble
instruction. That's how it is e.g. done in the unistd.h header file:
	ble 0x100(%sr2, %r0)
	ldi #syscall_nr, %r20
Because of that assumption the current code never restored %r20 before
returning to userspace.

This assumption is at least not true for code which uses the glibc
syscall() function, which instead uses this syntax:
	ble 0x100(%sr2, %r0)
	copy regX, %r20
where regX depend on how the compiler optimizes the code and register
usage.

This patch fixes this problem by adding code to analyze how the syscall
number is loaded in the delay branch and - if needed - copy the syscall
number to regX prior returning to userspace for the syscall restart.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
2015-12-21 10:16:18 +01:00
Vineet Gupta
6b538db7c6 ARC: dw2 unwind: Catch Dwarf SNAFUs early
Instead of seeing empty stack traces, let kernel fail early so dwarf
issues can be fixed sooner

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 14:01:49 +05:30
Vineet Gupta
6d0d506012 ARC: dw2 unwind: Don't bail for CIE.version != 1
The rudimentary CIE.version == 3 handling is already present in code
(for return address register specification)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 14:01:25 +05:30
Vineet Gupta
2d64affc92 Revert "ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing"
Blingly ignoring CIE.version != 1 was a bad idea.
It still leaves "desirability" when running perf with callgraphing where libgcc
symbols might show in hotspot.

More importantly, basic CIE.version == 3 support already exists in code:

|
|   retAddrReg = state.version <= 1 ? *ptr++ : get_uleb128(&ptr, end);
|

Next commit with simply add continue-not-bail for CIE.version != 1

This reverts commit 323f41f9e7.
2015-12-21 13:29:44 +05:30
Vineet Gupta
07fd7d4bbc ARC: Fix linking errors with CONFIG_MODULE + CONFIG_CC_OPTIMIZE_FOR_SIZE
At -Os, ARC gcc generates millicode thunk for function prologue/epilogue,
which are served by libgcc.

Modules historically are NOT linked with libgcc to avoid code bloat, reducing
runtime relocation fixups etc. I even once tried doing that but got lost
in makefile intricacies.

This means modules at -Os don't get the millicode thunks, causing build
failures below:

| MODPOST 5 modules
| ERROR: "__ld_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r18_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r17_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r17" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r16_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r16" [crypto/sha256_generic.ko] undefined!
|....
|....

Workaround that by inhibiting millicode thunks for loadable modules

Fixes STAR 9000641864:
("Linux built with optimizations for size emits errors for modules")

Reported-by: Anton Kolesov <akolesov@synosys.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 13:01:19 +05:30
Alexey Brodkin
4b32e89af7 ARC: mm: fix building for MMU v2
ARC700 cores with MMU v2 don't have IC_PTAG AUX register and so we only
define ARC_REG_IC_PTAG for MMU versions >= 3.

But current implementation of cache_line_loop_vX() routines assumes
availability of all of them (v2, v3 and v4) simultaneously.

And given undefined ARC_REG_IC_PTAG if CONFIG_MMU_VER=2 we're seeing
compilation problem:
---------------------------------->8-------------------------------
  CC      arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
   aux_tag = ARC_REG_IC_PTAG;
             ^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------

The simples fix is to have ARC_REG_IC_PTAG defined regardless MMU
version being used.

We don't use it in cache_line_loop_v2() anyways so who cares.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 12:10:40 +05:30
Vineet Gupta
899cfd2bb0 ARC: mm: HIGHMEM: Fix section mismatch splat
| WARNING: vmlinux.o(.text+0xd6c2): Section mismatch in reference from the function alloc_kmap_pgtable() to the function
| .init.text:__alloc_bootmem_low()
The function alloc_kmap_pgtable() references the function __init __alloc_bootmem_low().
This is often because alloc_kmap_pgtable lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 12:10:40 +05:30
Jake Oshins
c8f3e518d3 x86/irq: Export functions to allow MSI domains in modules
The Linux kernel already has the concept of IRQ domain, wherein a
component can expose a set of IRQs which are managed by a particular
interrupt controller chip or other subsystem. The PCI driver exposes
the notion of an IRQ domain for Message-Signaled Interrupts (MSI) from
PCI Express devices. This patch exposes the functions which are
necessary for creating a MSI IRQ domain within a module.

[ tglx: Split it into x86 and core irq parts ]

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Cc: gregkh@linuxfoundation.org
Cc: kys@microsoft.com
Cc: devel@linuxdriverproject.org
Cc: olaf@aepfle.de
Cc: apw@canonical.com
Cc: vkuznets@redhat.com
Cc: haiyangz@microsoft.com
Cc: marc.zyngier@arm.com
Cc: bhelgaas@google.com
Link: http://lkml.kernel.org/r/1449769983-12948-4-git-send-email-jakeo@microsoft.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-20 12:40:49 +01:00
35b3154efb powerpc fixes for 4.4 #4
- Partial revert of "powerpc: Individual System V IPC system calls"
  - pr_warn_once on unsupported OPAL_MSG type from Stewart
  - Fix deadlock in opal-irqchip introduced by "Fix double endian conversion" from Alistair
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWdTe9AAoJEFHr6jzI4aWANO4QAJwqS2Zhga0v/JSqMUpKgrlM
 aFMLp6nM25wHkFmRBbIgJbkQAqG+2Sl43OARuJ4y4b3J9fqjKBXxzymgPQ+Y1OlR
 xd+psLmfRf5r/cge45UgILhxE5LFckVIVg/uYkeI5zfRq9TqYes1Ys+7nGpV7IdB
 zWXseulscE/KcEbDHlBexN9/FujONZk6DU6m17TzJbkiptn+7CA0AahbWsK9t05g
 jXCppDPvGYvYGYQ4Y0G8Qnp3jELDlmPhwYWGLw7gruGTHSfbMQxhFokembk/HDcx
 tSetmTBzGt384h7dVJD6HF89VuwgqECBIL8hl0cajFkjkzgdIieDuEGWM1yov4+R
 7Tv05aO/5xYUy5vTk5qkMfywH+TQOwVjr3p3KZgGdj8ddYu7Sk4rRwKK/4cN1M7W
 /RrRzeSOJ2RkTedzu1/sd0h49r2o7tUtgC0rKosDBkibCk135yNa3pe7FKHyR8NW
 a/B57u6+wFvf374TUVMavcSeRvIa4cQ3YuMPcM8ykrEUJWB+QVEXPNiTNG5rux4Z
 4+VX9n0/LhsEu+dGMrdWkpEUSCXi4p6AQHKpTDocWSfEiGNU77b1vF7va06T34fN
 nbp6O0dXFnTgX6IJ8jEQzc8bELElFNw0gfgH6vHpgh6dkFEyL/ew0mJqQJ17bJLF
 HUxdt4/OgNVj5htfeKDI
 =58wA
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Partial revert of "powerpc: Individual System V IPC system calls"
 - pr_warn_once on unsupported OPAL_MSG type from Stewart
 - Fix deadlock in opal-irqchip introduced by "Fix double endian
   conversion" from Alistair

* tag 'powerpc-4.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"
  powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type
  Partial revert of "powerpc: Individual System V IPC system calls"
2015-12-19 16:40:48 -08:00
David Vrabel
d8c98a1d14 x86/paravirt: Prevent rtc_cmos platform device init on PV guests
Adding the rtc platform device in non-privileged Xen PV guests causes
an IRQ conflict because these guests do not have legacy PIC and may
allocate irqs in the legacy range.

In a single VCPU Xen PV guest we should have:

/proc/interrupts:
           CPU0
  0:       4934  xen-percpu-virq      timer0
  1:          0  xen-percpu-ipi       spinlock0
  2:          0  xen-percpu-ipi       resched0
  3:          0  xen-percpu-ipi       callfunc0
  4:          0  xen-percpu-virq      debug0
  5:          0  xen-percpu-ipi       callfuncsingle0
  6:          0  xen-percpu-ipi       irqwork0
  7:        321   xen-dyn-event     xenbus
  8:         90   xen-dyn-event     hvc_console
  ...

But hvc_console cannot get its interrupt because it is already in use
by rtc0 and the console does not work.

  genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)

We can avoid this problem by realizing that unprivileged PV guests (both
Xen and lguests) are not supposed to have rtc_cmos device and so
adding it is not necessary.

Privileged guests (i.e. Xen's dom0) do use it but they should not have
irq conflicts since they allocate irqs above legacy range (above
gsi_top, in fact).

Instead of explicitly testing whether the guest is privileged we can
extend pv_info structure to include information about guest's RTC
support.

Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: vkuznets@redhat.com
Cc: xen-devel@lists.xenproject.org
Cc: konrad.wilk@oracle.com
Cc: stable@vger.kernel.org # 4.2+
Link: http://lkml.kernel.org/r/1449842873-2613-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 21:35:13 +01:00
Michael Neuling
c395465da6 powerpc: Add function to copy mm_context_t to the paca
This adds a function to copy the mm->context to the paca.  This is
only a basic conversion for now but will be used more extensively in
the next patch.

This also adds #ifdef CONFIG_PPC_BOOK3S around this code since it's
not used elsewhere.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-19 22:13:12 +11:00
Borislav Petkov
4baf7fe407 x86/mm: Align macro defines
Bring PAGE_{SHIFT,SIZE,MASK} to the same indentation level as the rest
of the header.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449480268-26583-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:53:40 +01:00
Borislav Petkov
6e1315fe82 x86/cpu: Provide a config option to disable static_cpu_has
This brings .text savings of about ~1.6K when building a tinyconfig. It
is off by default so nothing changes for the default.

Kconfig help text from Josh.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Link: http://lkml.kernel.org/r/1449481182-27541-5-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:49:55 +01:00
Borislav Petkov
362f924b64 x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
Those are stupid and code should use static_cpu_has_safe() or
boot_cpu_has() instead. Kill the least used and unused ones.

The remaining ones need more careful inspection before a conversion can
happen. On the TODO.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de
Cc: David Sterba <dsterba@suse.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:49:55 +01:00
Borislav Petkov
39c06df4dc x86/cpufeature: Cleanup get_cpu_cap()
Add an enum for the ->x86_capability array indices and cleanup
get_cpu_cap() by killing some redundant local vars.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-3-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:49:54 +01:00
Borislav Petkov
2ccd71f1b2 x86/cpufeature: Move some of the scattered feature bits to x86_capability
Turn the CPUID leafs which are proper CPUID feature bit leafs into
separate ->x86_capability words.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-2-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:49:53 +01:00
Thomas Gleixner
0fa85119cd Merge branch 'linus' into x86/cleanups
Pull in upstream changes so we can apply depending patches.
2015-12-19 11:49:13 +01:00
Hidehiro Kawai
b279d67df8 x86/nmi: Save regs in crash dump on external NMI
Now, multiple CPUs can receive an external NMI simultaneously by
specifying the "apic_extnmi=all" command line parameter. When we take
a crash dump by using external NMI with this option, we fail to save
registers into the crash dump. This happens as follows:

  CPU 0                              CPU 1
  ================================   =============================
  receive an external NMI
  default_do_nmi()                   receive an external NMI
    spin_lock(&nmi_reason_lock)      default_do_nmi()
    io_check_error()                   spin_lock(&nmi_reason_lock)
      panic()                            busy loop
      ...
        kdump_nmi_shootdown_cpus()
          issue NMI IPI -----------> blocked until IRET
                                         busy loop...

Here, since CPU 1 is in NMI context, an additional NMI from CPU 0
remains unhandled until CPU 1 IRETs. However, CPU 1 will never execute
IRET so the NMI is not handled and the callback function to save
registers is never called.

To solve this issue, we check if the IPI for crash dumping was issued
while waiting for nmi_reason_lock to be released, and if so, call its
callback function directly. If the IPI is not issued (e.g. kdump is
disabled), the actual behavior doesn't change.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20151210065245.4587.39316.stgit@softrs
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:07:01 +01:00
Hidehiro Kawai
b7c4948e98 x86/apic: Introduce apic_extnmi command line parameter
This patch introduces a command line parameter apic_extnmi:

 apic_extnmi=( bsp|all|none )

The default value is "bsp" and this is the current behavior: only the
Boot-Strapping Processor receives an external NMI.

"all" allows external NMIs to be broadcast to all CPUs. This would
raise the success rate of panic on NMI when BSP hangs in NMI context
or the external NMI is swallowed by other NMI handlers on the BSP.

If you specify "none", no CPUs receive external NMIs. This is useful for
the dump capture kernel so that it cannot be shot down by accidentally
pressing the external NMI button (on platforms which have it) while
saving a crash dump.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Bandan Das <bsd@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20151210014632.25437.43778.stgit@softrs
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:07:01 +01:00
Hidehiro Kawai
58c5661f21 panic, x86: Allow CPUs to save registers even if looping in NMI context
Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(),
sends an NMI IPI to CPUs which haven't called panic() to stop them,
save their register information and do some cleanups for crash dumping.
However, if such a CPU is infinitely looping in NMI context, we fail to
save its register information into the crash dump.

For example, this can happen when unknown NMIs are broadcast to all
CPUs as follows:

  CPU 0                             CPU 1
  ===========================       ==========================
  receive an unknown NMI
  unknown_nmi_error()
    panic()                         receive an unknown NMI
      spin_trylock(&panic_lock)     unknown_nmi_error()
      crash_kexec()                   panic()
                                        spin_trylock(&panic_lock)
                                        panic_smp_self_stop()
                                          infinite loop
        kdump_nmi_shootdown_cpus()
          issue NMI IPI -----------> blocked until IRET
                                          infinite loop...

Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is
blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET,
so the NMI is not handled and the callback function to save registers is
never called.

In practice, this can happen on some servers which broadcast NMIs to all
CPUs when the NMI button is pushed.

To save registers in this case, we need to:

  a) Return from NMI handler instead of looping infinitely
  or
  b) Call the callback function directly from the infinite loop

Inherently, a) is risky because NMI is also used to prevent corrupted
data from being propagated to devices.  So, we chose b).

This patch does the following:

1. Move the infinite looping of CPUs which haven't called panic() in NMI
   context (actually done by panic_smp_self_stop()) outside of panic() to
   enable us to refer pt_regs. Please note that panic_smp_self_stop() is
   still used for normal context.

2. Call a callback of kdump_nmi_shootdown_cpus() directly to save
   registers and do some cleanups after setting waiting_for_crash_ipi which
   is used for counting down the number of CPUs which handled the callback

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml <linux-kernel@vger.kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:07:01 +01:00
Hidehiro Kawai
1717f2096b panic, x86: Fix re-entrance problem due to panic on NMI
If panic on NMI happens just after panic() on the same CPU, panic() is
recursively called. Kernel stalls, as a result, after failing to acquire
panic_lock.

To avoid this problem, don't call panic() in NMI context if we've
already entered panic().

For that, introduce nmi_panic() macro to reduce code duplication. In
the case of panic on NMI, don't return from NMI handlers if another CPU
already panicked.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml <linux-kernel@vger.kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 11:07:00 +01:00
Thomas Gleixner
d267b8d6c6 Merge branch 'linus' into x86/apic
Pull in update changes so we can apply conflicting patches
2015-12-19 11:03:18 +01:00
Boris Ostrovsky
91e2eea98f x86/xen: Avoid fast syscall path for Xen PV guests
After 32-bit syscall rewrite, and specifically after commit:

  5f310f739b ("x86/entry/32: Re-implement SYSENTER using the new C path")

... the stack frame that is passed to xen_sysexit is no longer a
"standard" one (i.e. it's not pt_regs).

Since we end up calling xen_iret from xen_sysexit we don't need
to fix up the stack and instead follow entry_SYSENTER_32's IRET
path directly to xen_iret.

We can do the same thing for compat mode even though stack does
not need to be fixed. This will allow us to drop usergs_sysret32
paravirt op (in the subsequent patch)

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1447970147-1733-2-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 09:55:52 +01:00
Ashok Raj
d90167a941 x86/mce: Ensure offline CPUs don't participate in rendezvous process
Intel's MCA implementation broadcasts MCEs to all CPUs on the
node. This poses a problem for offlined CPUs which cannot
participate in the rendezvous process:

  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
  Kernel Offset: disabled
  Rebooting in 100 seconds..

More specifically, Linux does a soft offline of a CPU when
writing a 0 to /sys/devices/system/cpu/cpuX/online, which
doesn't prevent the #MC exception from being broadcasted to that
CPU.

Ensure that offline CPUs don't participate in the MCE rendezvous
and clear the RIP valid status bit so that a second MCE won't
cause a shutdown.

Without the patch, mce_start() will increment mce_callin and
wait for all CPUs. Offlined CPUs should avoid participating in
the rendezvous process altogether.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-19 09:55:31 +01:00
Tony Lindgren
0b4d6972d7 ARM: dts: Fix UART wakeirq for omap4 duovero parlor
Looks like we're missing the wakeirq for the console uart for
duovero parlor. Let's add that as without it console acess just
hangs with PM enabled.

Cc: Arun Bharadwaj <arun@gumstix.com>
Cc: Ash Charles <ash@gumstix.com>
Cc: Florian Vaussard <florian.vaussard@epfl.ch>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-12-18 16:43:22 -08:00
Boris BREZILLON
038a5380e3 cris: nand: remove useless mtd->priv = chip assignments
mtd_to_nand() now uses the container_of() approach to transform an
mtd_info pointer into a nand_chip one. Drop useless mtd->priv
assignments from NAND controller drivers.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-12-18 13:22:11 -08:00
Daniel Borkmann
606c88a86c bpf, x86: detect/optimize loading 0 immediates
When sometimes structs or variables need to be initialized/'memset' to 0 in
an eBPF C program, the x86 BPF JIT converts this to use immediates. We can
however save a couple of bytes (f.e. even up to 7 bytes on a single emmission
of BPF_LD | BPF_IMM | BPF_DW) in the image by detecting such case and use xor
on the dst register instead.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-18 16:04:51 -05:00
Daniel Borkmann
8b614aebec bpf: move clearing of A/X into classic to eBPF migration prologue
Back in the days where eBPF (or back then "internal BPF" ;->) was not
exposed to user space, and only the classic BPF programs internally
translated into eBPF programs, we missed the fact that for classic BPF
A and X needed to be cleared. It was fixed back then via 83d5b7ef99
("net: filter: initialize A and X registers"), and thus classic BPF
specifics were added to the eBPF interpreter core to work around it.

This added some confusion for JIT developers later on that take the
eBPF interpreter code as an example for deriving their JIT. F.e. in
f75298f5c3 ("s390/bpf: clear correct BPF accumulator register"), at
least X could leak stack memory. Furthermore, since this is only needed
for classic BPF translations and not for eBPF (verifier takes care
that read access to regs cannot be done uninitialized), more complexity
is added to JITs as they need to determine whether they deal with
migrations or native eBPF where they can just omit clearing A/X in
their prologue and thus reduce image size a bit, see f.e. cde66c2d88
("s390/bpf: Only clear A and X for converted BPF programs"). In other
cases (x86, arm64), A and X is being cleared in the prologue also for
eBPF case, which is unnecessary.

Lets move this into the BPF migration in bpf_convert_filter() where it
actually belongs as long as the number of eBPF JITs are still few. It
can thus be done generically; allowing us to remove the quirk from
__bpf_prog_run() and to slightly reduce JIT image size in case of eBPF,
while reducing code duplication on this matter in current(/future) eBPF
JITs.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Yang Shi <yang.shi@linaro.org>
Acked-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-18 16:04:51 -05:00
Boris BREZILLON
4883090bed cris: nand: use the mtd instance embedded in struct nand_chip
struct nand_chip now embeds an mtd device. Patch all drivers to make use
of this mtd instance instead of using the instance embedded in their
private struct or dynamically allocated.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-12-18 12:44:57 -08:00
3273cba195 xen: bug fixes for 4.4-rc5
- XSA-155 security fixes to backend drivers.
 - XSA-157 security fixes to pciback.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWdDrXAAoJEFxbo/MsZsTR3N0H/0Lvz6MWBARCje7livbz7nqE
 PS0Bea+2yAfNhCDDiDlpV0lor8qlyfWDF6lGhLjItldAzahag3ZDKDf1Z/lcQvhf
 3MwFOcOVZE8lLtvLT6LGnPuehi1Mfdi1Qk1/zQhPhsq6+FLPLT2y+whmBihp8mMh
 C12f7KRg5r3U7eZXNB6MEtGA0RFrOp0lBdvsiZx3qyVLpezj9mIe0NueQqwY3QCS
 xQ0fILp/x2EnZNZuzgghFTPRxMAx5ReOezgn9Rzvq4aThD+irz1y6ghkYN4rG2s2
 tyYOTqBnjJEJEQ+wmYMhnfCwVvDffztG+uI9hqN31QFJiNB0xsjSWFCkDAWchiU=
 =Argz
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:
 - XSA-155 security fixes to backend drivers.
 - XSA-157 security fixes to pciback.

* tag 'for-linus-4.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen-pciback: fix up cleanup path when alloc fails
  xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set.
  xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled.
  xen/pciback: Do not install an IRQ handler for MSI interrupts.
  xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled
  xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled
  xen/pciback: Save xen_pci_op commands before processing it
  xen-scsiback: safely copy requests
  xen-blkback: read from indirect descriptors only once
  xen-blkback: only read request operation from shared ring once
  xen-netback: use RING_COPY_REQUEST() throughout
  xen-netback: don't use last request to determine minimum Tx credit
  xen: Add RING_COPY_REQUEST()
  xen/x86/pvh: Use HVM's flush_tlb_others op
  xen: Resume PMU from non-atomic context
  xen/events/fifo: Consume unprocessed events when a CPU dies
2015-12-18 12:24:52 -08:00
83ad283f6b ARC Fixes
- perf interrupts on SMP: Not enabled (at boot) and disabled (at runtime)
  - stack unwinder regression (for modules, ignoring dwarf3)
  - nsim hosed for non default kernel link base builds
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWc/ceAAoJEGnX8d3iisJeenQP/jdQPxtDU5CYURuqpS1Xb7hN
 vEBCdj/2g8aNxF5/KSsUzSnnH5DJkm4I6fqo+/dqr9J32NrsU9skvO20BMSckMYV
 cLhrmsVLqo9DLGj23Gjl1427o0cLIloQwrE0hfgYGz2ceMl8CwTjt89+ZHPNZ2WB
 2m0q4pFtwHY5ZZBSh+kmN0OCZIVLB3Jydp0+V18DQtRFop22tkypYtiQ1DP3NysU
 /1w5EJopjGSqRbtQSzahFtXEwAzhM2UfrIYSZs0iXSGDTkggH4ouJs817g0kHqqL
 MOSvwlEk0Yz2jTr+PwHX1PSvIRfrJHi2x9U2nfrB9Q2flXV/XU3ZX1/k3lDV1mh2
 hQjwXsCDyQEhAvVqVtLZqxHbrFumlOW1rd8tfl49MYCqRJy/TOYXyxYzk1M0zUMy
 pgUHjEBrK0hnpnycxOLQV6XrZaYEDWFH88Ke+vxI4lI5pGwarQLPgCfd2kCX/0jG
 miACD2EK97fdvr7b9UATJ++UUqhwOwYZ+OCw4vJhiDwBAOVgPtLKnrTETPqqqPBt
 72uwGxCjjHI6a9foDDcf41BbgktqInUvyf/bHOJOnEiXT+U1SN/4+1E5sbgL+W6q
 5kDGhaWxc81EEmrwI4kFtPa3JdwWiJc2R1JU5Gkng/DHrVUFEHsEpcYqVgciqNqm
 c+84ML+VzkxIKnVgQvqM
 =+aYb
 -----END PGP SIGNATURE-----

Merge tag 'arc-fixes-for-4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC architecture fixes from Vineet Gupta:
 "Fixes for:

 - perf interrupts on SMP: Not enabled (at boot) and disabled (at runtime)
 - stack unwinder regression (for modules, ignoring dwarf3)
 - nsim hosed for non default kernel link base builds"

* tag 'arc-fixes-for-4.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: smp: Rename platform hook @init_cpu_smp -> @init_per_cpu
  ARC: rename smp operation init_irq_cpu() to init_per_cpu()
  ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing
  ARC: dw2 unwind: Reinstante unwinding out of modules
  ARC: [plat-sim] unbork non default CONFIG_LINUX_LINK_BASE
  ARC: intc: Document arc_request_percpu_irq() better
  ARCv2: perf: Ensure perf intr gets enabled on all cores
  ARC: intc: No need to clear IRQ_NOAUTOEN
  ARCv2: intc: Fix random perf irq disabling in SMP setup
  ARC: [axs10x] cap ethernet phy to 100 Mbit/sec
2015-12-18 12:19:01 -08:00
Takuya Yoshikawa
774926641d KVM: x86: MMU: Use clear_page() instead of init_shadow_page_table()
Not just in order to clean up the code, but to make it faster by using
enhanced instructions: the initialization became 20-30% faster on our
testing machine.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-18 19:07:45 +01:00
Peter Oberparleiter
2ab59de7c5 s390/cio: Consolidate inline assemblies and related data definitions
Replace the current semi-arbitrary distribution of inline assemblies:
 - Inline assemblies used by CIO go into ioasm.h
 - Data definitions used by inline assemblies go into cio.h

Beyond cleaning up the current structure this is also required for
use of tracepoints in inline assemblies introduced by a follow-on
patch.

Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:34 +01:00
Christian Borntraeger
d7ae65aec0 s390/setup: cleanup machine flags
Over time some machine flags got unused (e.g. MACHINE_FLAG_MVPG)
or are available on all 64bit systems (MACHINE_FLAG_CSP,
MACHINE_FLAG_IEEE) - let's remove them.
Reorder the other ones to match the order of the MACHINE_HAS_*
macros and renumber all bits to avoid holes.
Also fix the comment about where the flags are detected.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:32 +01:00
Heiko Carstens
3dbc78d3a1 s390/smp: save timestamp on external calls
This is supposed to make debugging easier: if within a dump we can see
that an external call or emergency signal IPI is pending but all cpus
are idle, we have no idea for how long the interrupt is outstanding.

Therefore save a timestamp into the per cpu pcpu array of the target
cpu whenever such an IPI is sent.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:31 +01:00
Christian Borntraeger
e527aec434 s390/sysinfo: Remove unused variables
max_mnest and rc are never used.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:30 +01:00
Christian Borntraeger
90f405e859 s390/extmem: remove unused variable
findseg_scode is assigned, but never used.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:29 +01:00
Christian Borntraeger
292d8d7157 s390/fault: remove unused variable
address is assigned but never used.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:28 +01:00
Christian Borntraeger
f9101e639c s390/traps: Remove unused variable
location is assigned but never used.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:27 +01:00
Heiko Carstens
6527f6e0cb s390: compile head.S always with -march=z900
head.S on s390 contains some sanity checks if the kernel will run on a
machine or if the machine is too old, e.g. if the kernel contains
instructions not available on the machine. If so, it will emit an error
message to the console before it stops execution.

Therefore head.S contains only instructions which are availanble with the
earliest machine generation (z900).  In order to make sure we don't
accidently add instructions which are not available on z900, always compile
with -march=z900. This makes sure compilation will fail if wrong
instructions are used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:26 +01:00
Heiko Carstens
f3d05f9de7 s390/facilities: add z13 als bit
If configured for z13 assume the kernel makes use of the instructions
that are part of the load-and-zero-rightmost-byte facility and
load/store-on-condition facility 2.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:24 +01:00
Heiko Carstens
6ab6c31a54 s390/facilities: optimize test_facility()
test_facility() can be optimized for bits which must be set anyway,
due to the check in head.S. This removes a couple of superfluous
runtime checks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:23 +01:00
Heiko Carstens
185edd4431 s390/facilities: remove unneeded facility bits
The facility lists contain a lot of bits which are not necessary to
run the kernel.  Therefore remove them and keep only those bits which
are required for the kernel.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:22 +01:00
Heiko Carstens
0358ecf732 s390/facilities: make use of generated facility list
Change head.S to make use of the generated facility list.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:21 +01:00
Heiko Carstens
c30f6828fe s390/facilities: add helper tool to generate facility lists
Modifying the architecture level set facility lists was always very
error prone. Given the numbering of the facility bits within the
Principles of Operation, where the most significant bit number is 0,
it happened a lot of times that wrong bits were set or cleared.

Therefore this patch adds a tool "gen_facilities" which generates
include/generated/facilites.h.  The definition of the bits to be set
is contained within arch/s390/include/asm/facilities_src.h and can be
easily extended to e.g. also generate such lists for the KVM module.

The generated file looks like this:

 #define FACILITIES_ALS _AC(0xc1006450f0040000,UL)
 #define FACILITIES_ALS_DWORDS 1

The facility bits defined in this patch match 1:1 to the current masks
that can be found in head.S.

That is if the tool gets executed with -march=z990 then the generated
masks will equal the masks in head.S for CONFIG_MARCH_Z990.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:20 +01:00
Heiko Carstens
76cdd44c2e s390/facilities: always use lowcore's stfle field for storing facility bits
head.s contains an stfle instruction which stores it result at the
storage location that is assigned to the stfl instruction.

This is currently no problem, since we only care about one double
word. However if the number of double words in the ALS bitfield grows
the current code is not very stable.

E.g. before issuing the stfle command the memory to which it stores
must be cleared, since the instruction may or may not clear memory
contents where no bits are set.

In order to simplify the code a bit always use the storage location
that we reserved for the stfle result.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:19 +01:00
Heiko Carstens
9552a66fe6 s390/facilities: use stfl mnemonic instead of insn magic
Now that 31 bit support is gone, the assembler always knows about the
stfl instruction. Therefore lets use a readable mnemonic.  Also remove
the not needed extable entry for the inline assembly and fix the
output constraint.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:59:18 +01:00
Michael Holzheu
272fa59ccb s390/dis: Fix handling of format specifiers
The print_insn() function returns strings like "lghi %r1,0". To escape the
'%' character in sprintf() a second '%' is used. For example "lghi %%r1,0"
is converted into "lghi %r1,0".

After print_insn() the output string is passed to printk(). Because format
specifiers like "%r" or "%f" are ignored by printk() this works by chance
most of the time. But for instructions with control registers like
"lctl %c6,%c6,780" this fails because printk() interprets "%c" as
character format specifier.

Fix this problem and escape the '%' characters twice.

For example "lctl %%%%c6,%%%%c6,780" is then converted by sprintf()
into "lctl %%c6,%%c6,780" and by printk() into "lctl %c6,%c6,780".

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-12-18 14:43:21 +01:00
Pavel Fedin
c7da6fa43c arm/arm64: KVM: Detect vGIC presence at runtime
Before commit 662d971584
("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}") is was possible to
compile the kernel without vGIC and vTimer support. Commit message says
about possibility to detect vGIC support in runtime, but this has never
been implemented.

This patch introduces runtime check, restoring the lost functionality.
It again allows to use KVM on hardware without vGIC. Interrupt
controller has to be emulated in userspace in this case.

-ENODEV return code from probe function means there's no GIC at all.
-ENXIO happens when, for example, there is GIC node in the device tree,
but it does not specify vGIC resources. Any other error code is still
treated as full stop because it might mean some really serious problems.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-18 12:01:58 +00:00
Alistair Popple
036592fbbe powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"
Commit 25642e1459 ("powerpc/opal-irqchip: Fix double endian
conversion") fixed an endian bug by calling opal_handle_events() in
opal_event_unmask().

However this introduced a deadlock if we find an event is active
during unmasking and call opal_handle_events() again. The bad call
sequence is:

  opal_interrupt()
  -> opal_handle_events()
     -> generic_handle_irq()
        -> handle_level_irq()
           -> raw_spin_lock(&desc->lock)
              handle_irq_event(desc)
              unmask_irq(desc)
              -> opal_event_unmask()
                 -> opal_handle_events()
                    -> generic_handle_irq()
                       -> handle_level_irq()
                          -> raw_spin_lock(&desc->lock)	(BOOM)

When generating multiple opal events in quick succession this would lead
to the following stall warnings:

EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
INFO: rcu_sched detected stalls on CPUs/tasks:

         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
         (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
INFO: rcu_sched detected stalls on CPUs/tasks:
         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
         (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)

This patch corrects the problem by queuing the work if an event is
active during unmasking, which is similar to the pre-endian fix
behaviour.

Fixes: 25642e1459 ("powerpc/opal-irqchip: Fix double endian conversion")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-18 22:24:15 +11:00
Vladimir Murzin
20475f784d arm64: KVM: Add support for 16-bit VMID
The ARMv8.1 architecture extension allows to choose between 8-bit and
16-bit of VMID, so use this capability for KVM.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-18 10:15:12 +00:00
Vladimir Murzin
8420dcd37e arm: KVM: Make kvm_arm.h friendly to assembly code
kvm_arm.h is included from both C code and assembly code; however some
definitions in this header supplied with U/UL/ULL suffixes which might
confuse assembly once they got evaluated.
We have _AC macro for such cases, so just wrap problem places with it.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-18 10:15:11 +00:00
Vladimir Murzin
9d4dc68834 arm/arm64: KVM: Remove unreferenced S2_PGD_ORDER
Since commit a987370 ("arm64: KVM: Fix stage-2 PGD allocation to have
per-page refcounting") there is no reference to S2_PGD_ORDER, so kill it
for the good.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-18 10:15:11 +00:00
Marc Zyngier
281243cbe0 arm64: KVM: debug: Remove spurious inline attributes
The debug trapping code is pretty heavy on the "inline" attribute,
but most functions are actually referenced in the sysreg tables,
making the inlining imposible.

Removing the useless inline qualifier seems the right thing to do,
having verified that the output code is similar.

Cc: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-18 10:15:11 +00:00
Marc Zyngier
e078ef8151 ARM: KVM: Cleanup exception injection
David Binderman reported that the exception injection code had a
couple of unused variables lingering around.

Upon examination, it looked like this code could do with an
anticipated spring cleaning, which amounts to deduplicating
the CPSR/SPSR update, and making it look a bit more like
the architecture spec.

The spurious variables are removed in the process.

Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-18 10:14:54 +00:00
d7637d01be Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:

 - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
   These are tagged for -stable.  The scatterlist impact is potentially
  corrupted dma addresses on HIGHMEM enabled platforms.

 - A minor locking fix for the NFIT hot-add implementation that is new
   in 4.4-rc.  This would only trigger in the case a hot-add raced
   driver removal.

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dma-debug: Fix dma_debug_entry offset calculation
  Revert "scatterlist: use sg_phys()"
  nfit: acpi_nfit_notify(): Do not leave device locked
2015-12-17 11:20:13 -08:00
Felipe Balbi
54011103fb ARM: OMAP2+: AM43xx: select ARM TWD timer
Make sure to tell the kernel that AM437x devices have ARM TWD timer.

Signed-off-by: Felipe Balbi <balbi@ti.com>
[grygorii.strashko@ti.com: drop ARM Global timer selection, because
 it's incompatible with PM (cpuidle/cpufreq). So, it's unsafe to enable
 it unconditionally]
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-12-17 10:54:13 -08:00
Grygorii Strashko
0b3e6fca4d ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST
System will misbehave in the following case:
- AM43XX only build (UP);
- CONFIG_CPU_IDLE=y
- ARM TWD timer enabled and selected as clockevent device.

In the above case, It's expected that broadcast timer will be used as
backup timer when CPUIdle will put MPU in low power states where ARM
TWD will stop and lose its context. But, the CONFIG_SMP might not be
selected when kernel is built for AM43XX SoC only and, as result,
GENERIC_CLOCKEVENTS_BROADCAST option will not be selected also. This
will break CPUIdle and System will stuck in low power states.

Hence, fix it by selecting GENERIC_CLOCKEVENTS_BROADCAST option for
AM43XX SoCs always and add empty tick_broadcast() function
implementation - no need to send any IPI on UP. After this change
timer1 will be selected as broadcast timer the same way as for SMP,
and CPUIdle will work properly.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-12-17 10:53:28 -08:00
Daniel Axtens
1b855e167b powerpc: Add missing calls to va_end()
cppcheck picked up that there were a couple of missing va_end()
calls in functions using va_start().

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 23:23:22 +11:00
Nathan Fontenot
e9d764f803 powerpc/pseries: Enable kernel CPU dlpar from sysfs
Enable new kernel cpu hotplug functionality by allowing cpu dlpar requests
to be initiated from sysfs.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:03 +11:00
Nathan Fontenot
90edf184b9 powerpc/pseries: Add CPU dlpar add functionality
Add the ability to hotplug add cpus via rtas hotplug events by either
specifying the drc index of the CPU to add, or providing a count of the
number of CPUs to add.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot
ac71380071 powerpc/pseries: Add CPU dlpar remove functionality
Add the ability to dlpar remove CPUs via hotplug rtas events, either by
specifying the drc-index of the CPU to remove or providing a count of cpus
to remove.

To remove multiple cpus in a single request we create a list of possible
DR (Dynamic Reconfiguration) cpus and their drc indexes that can be
removed.  We can then traverse the list remove each cpu and easily clean
up in any cases of failure.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot
e666ae0b10 powerpc/pseries: Update CPU hotplug error recovery
Update the cpu dlpar add/remove paths to do better error recovery when
a failure occurs during the add/remove operation.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot
d98389f375 powerpc/pseries: Factor out common cpu hotplug code
Re-factor the cpu hotplug code to support doing cpu hotplug completely in
the kernel and using the existing sysfs probe/release interfaces. This
patch pulls out pieces of existing cpu hotplug code into common routines,
dlpar_cpu_add() and dlpar_cpu_remove(), to be used by both interfaces.
There are no functional changes introduced.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Nathan Fontenot
183deeea58 powerpc/pseries: Consolidate CPU hotplug code to hotplug-cpu.c
No functional changes, this patch is simply a move of the cpu hotplug
code from pseries/dlpar.c to pseries/hotplug-cpu.c. This is in an effort
to consolidate all of the cpu hotplug code in a common place.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Nathan Fontenot
1f859adb92 powerpc/pseries: Verify CPU doesn't exist before adding
When DLPAR adding a CPU we should verify that the CPU does not already
exist. Failure to do so can generate a kernel oops;

[    9.465585] kernel BUG at arch/powerpc/platforms/pseries/dlpar.c:382!
[    9.465796] Oops: Exception in kernel mode, sig: 5 [#1]

This oops can be generated by causing a probe to be performed on a cpu
by writing to the sysfs cpu probe file (/sys/devices/system/cpu/probe).
This patch adds a check for the existence of cpu prior to probing the cpu
so userspace doing the wrong thing won't trigger a BUG_ON().

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Alistair Popple
4450022b49 powerpc/476fpe: Add support for kexec
PPC476FPE has a different PVR from previous PPC476 processors. The
kexec code checks the PVR in order to correctly setup the MMU. When
the initial support for 476FPE processors was added the corresponding
change in the kexec code was missed. This patch simply adds the check
and solves the following bug on kexec:

kexec: Starting new kernel
Bye!
Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0xee9a50f8
cpu 0x0: Vector: 400 (Instruction Access) at [ee9d7d20]
    pc: ee9a50f8
    lr: ee9a50e4
    sp: ee9d7dd0
    msr: 21020
    current = 0xee40f000
    pid   = 960, comm = kexec
enter ? for help
[link register   ] ee9a50e4
[ee9d7dd0] c0013748 default_machine_kexec+0x58/0x70 (unreliable)
[ee9d7df0] c0012f04 machine_kexec+0x34/0x40
[ee9d7e00] c00aa1ec kernel_kexec+0x9c/0xb0
[ee9d7e20] c005d704 SyS_reboot+0x1f4/0x220
[ee9d7f40] c000db68 ret_from_syscall+0x0/0x3c

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:00 +11:00
Alistair Popple
5d2aa710e6 powerpc/powernv: Add support for Nvlink NPUs
NVLink is a high speed interconnect that is used in conjunction with a
PCI-E connection to create an interface between CPU and GPU that
provides very high data bandwidth. A PCI-E connection to a GPU is used
as the control path to initiate and report status of large data
transfers sent via the NVLink.

On IBM Power systems the NVLink processing unit (NPU) is similar to
the existing PHB3. This patch adds support for a new NPU PHB type. DMA
operations on the NPU are not supported as this patch sets the TCE
translation tables to be the same as the related GPU PCIe device for
each NVLink. Therefore all DMA operations are setup and controlled via
the PCIe device.

EEH is not presently supported for the NPU devices, although it may be
added in future.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:00 +11:00
Alistair Popple
a84bf32140 powerpc: Add __raw_rm_writeq() function
Move __raw_rm_writeq() from platforms/powernv/pci-ioda.c to
include/asm/io.h so that it can be used by other code.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Alistair Popple
94973b24d6 Revert "powerpc/pci: Remove unused struct pci_dn.pcidev field"
This commit removed the pcidev field from struct pci_dn as it was no
longer in use by the kernel. However to support finding the
association of Nvlink devices to GPU devices from the device-tree this
field is required.

This reverts commit 250c7b277c ("powerpc/pci: Remove unused struct
pci_dn.pcidev field").

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Gavin Shan
e80c4e7ca5 powerpc/powernv: Fix M64 resource name in /proc/iomem
The name of PCI root bus's M64 resource isn't initialized properly.
When dumping "/proc/iomem", "<BAD>" is seen for those M64 resources
on PCI root buses.

   ~# cat /proc/iomem | grep -e "BAD"
   3b0000000000-3b0fefffffff : <BAD>
   3b1000000000-3b1fefffffff : <BAD>
   3c0000000000-3c0fefffffff : <BAD>
   3c1000000000-3c1fefffffff : <BAD>
   3c2000000000-3c2fefffffff : <BAD>

This fixes the issue by setting the name of PCI root bus's M64
resource to that of PHB's device node full name. With the patch,
no "<BAD>" is seen from "/proc/iomem".

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Laurent Dufour
7207f43665 powerpc/mm: Add page soft dirty tracking
User space checkpoint and restart tool (CRIU) needs the page's change
to be soft tracked. This allows to do a pre checkpoint and then dump
only touched pages.

This is done by using a newly assigned PTE bit (_PAGE_SOFT_DIRTY) when
the page is backed in memory, and a new _PAGE_SWP_SOFT_DIRTY bit when
the page is swapped out.

To introduce a new PTE _PAGE_SOFT_DIRTY bit value common to hash 4k
and hash 64k pte, the bits already defined in hash-*4k.h should be
shifted left by one.

The _PAGE_SWP_SOFT_DIRTY bit is dynamically put after the swap type in
the swap pte. A check is added to ensure that the bit is not
overwritten by _PAGE_HPTEFLAGS.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:58 +11:00
Michael Ellerman
2613265cb5 powerpc/kernel: Combine vec/loc for STD_EXCEPTION_PSERIES
The STD_EXCEPTION_PSERIES macro takes both a vector number, and a
location (memory address). However both are always identical, so combine
them to save repeating ourselves.

This does mean an exception handler must always exist at the location in
memory that matches its vector number. But that's OK because this is the
"STD" macro (standard), which does exactly that. We have other macros
for the other cases, eg. STD_EXCEPTION_PSERIES_OOL (out of line).

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:58 +11:00
Michael Ellerman
d8725ce86c powerpc/kernel: Open code SET_DEFAULT_THREAD_PPR
This is only used in one location, open code it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:57 +11:00
Michael Ellerman
d030a4b5eb powerpc/kernel: Open code HMT_MEDIUM_LOW_HAS_PPR
HMT_MEDIUM_LOW_HAS_PPR is only used in once place, open code it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:57 +11:00
Michael Ellerman
d6265aeaf8 powerpc/kernel: Drop HMT_MEDIUM_PPR_DISCARD
HMT_MEDIUM_PPR_DISCARD is a macro which is present at the start of most
of our first level exception handlers. It conditionally executes a
HMT_MEDIUM instruction, which sets the processor priority to medium.

On on modern systems, ie. Power7 and later, it is nop'ed out at boot.
All it does is make the exception vectors more cramped, and consume 4
bytes of icache.

On old systems it has the effect of boosting the processor priority at
the start of exception processing. If we were previously in the idle
loop for example, we may be at low or very low priority. This is
desirable as we want to process the exception as fast as possible.

However looking closely at the generated code, we see that in all cases
we execute another HMT_MEDIUM just four instructions later. With code
patching applied, the final code on an old (Power6) system will look
like, eg:

  c000000000000300 <data_access_pSeries>:
  c000000000000300:	7c 42 13 78	mr	r2,r2		<-
  c000000000000304:	7d b2 43 a6	mtsprg	2,r13
  c000000000000308:	7d b1 42 a6	mfsprg	r13,1
  c00000000000030c:	f9 2d 00 80	std	r9,128(r13)
  c000000000000310:	60 00 00 00	nop
  c000000000000314:	7c 42 13 78	mr	r2,r2		<-

So I suggest that the added code complexity of HMT_MEDIUM_PPR_DISCARD is
not justified by the benefit of boosting the processor priority for the
duration of four instructions, and therefore we drop it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:57 +11:00
Michael Ellerman
cd5cdeb6c8 powerpc/rtas: Make enter_rtas() private
There are no longer any users of enter_rtas() outside of rtas.c, so make
it "private", by moving the declaration inside rtas.c. Hopefully this
will encourage people to use one of the wrappers which takes the sharp
edges off the RTAS calling sequence.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:56 +11:00
Michael Ellerman
4456f45246 powerpc/rtas: Use rtas_call_unlocked() in call_rtas_display_status()
Although call_rtas_display_status() does actually want to use the
regular RTAS locking, it doesn't want the extra logic that is in
rtas_call(), so currently it open codes the logic.

Instead we can use rtas_call_unlocked(), after taking the RTAS lock.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:56 +11:00
Michael Ellerman
b2e8590fa1 powerpc/pseries: Use rtas_call_unlocked() in pseries hotplug
Avoid open coding the logic by using rtas_call_unlocked().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:55 +11:00
Michael Ellerman
08eb105a7c powerpc/xmon: Use rtas_call_unlocked() in xmon
Avoid open coding the logic by using rtas_call_unlocked().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:55 +11:00
Michael Ellerman
209eb4e5cb powerpc/rtas: Add rtas_call_unlocked()
Most users of RTAS (Run-Time Abstraction Services) use rtas_call(),
which deals with locking as well as endian handling.

However we have two users outside of rtas.c that can't use rtas_call()
because they have different locking requirements.

The hotplug CPU code can't take the RTAS lock because the CPU would go
offline with the lock held and no other CPUs would be able to call RTAS
until the CPU came back online.

The xmon code doesn't want to take the lock because it would risk dead
locking when we are trying to recover from a crash.

Both sites required multiple patches when we added little endian
support, proving that programmers can't do endian right.

Although that ship has sailed, we can still clean the code up by
providing an unlocked version of rtas_call() which avoids the need to
open code the logic elsewhere.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:55 +11:00
Stewart Smith
e4d54f71d2 powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL
Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OPALv3 to
be cared about or supported by mainline kernels.

So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL
exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Stewart Smith
7261aafc09 powerpc/powernv: Remove OPALv2 firmware define and references
OPALv2 only ever existed in the lab and didn't escape to the world.
All OPAL systems in the wild are OPALv3.

The probability of there being an OPALv2 system still powered on
anywhere inside IBM is approximately zero, let alone anyone
expecting to run mainline kernels.

So, start to remove references to OPALv2.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Stewart Smith
786842b62f powerpc/powernv: panic() on OPAL < V3
The OpenPower Abstraction Layer firmware went through a couple
of iterations in the lab before being released. What we now know
as OPAL advertises itself as OPALv3.

OPALv2 and OPALv1 never made it outside the lab, and the possibility
of anyone at all ever building a mainline kernel today and expecting
it to boot on such hardware is zero.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:53 +11:00
Ashok Kumar
0a28714c53 arm64: Use PoU cache instr for I/D coherency
In systems with three levels of cache(PoU at L1 and PoC at L3),
PoC cache flush instructions flushes L2 and L3 caches which could affect
performance.
For cache flushes for I and D coherency, PoU should suffice.
So changing all I and D coherency related cache flushes to PoU.

Introduced a new __clean_dcache_area_pou API for dcache flush till PoU
and provided a common macro for __flush_dcache_area and
__clean_dcache_area_pou.

Also, now in __sync_icache_dcache, icache invalidation for non-aliasing
VIPT icache is done only for that particular page instead of the earlier
__flush_icache_all.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17 11:07:13 +00:00
Ashok Kumar
e6b1185f77 arm64: Defer dcache flush in __cpu_copy_user_page
Defer dcache flushing to __sync_icache_dcache by calling
flush_dcache_page which clears PG_dcache_clean flag.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17 11:07:13 +00:00
Nicolas Pitre
42f25bddd0 ARM: 8477/1: runtime patch udiv/sdiv instructions into __aeabi_{u}idiv()
The ARM compiler inserts calls to __aeabi_idiv() and
__aeabi_uidiv() when it needs to perform division on signed and
unsigned integers. If a processor has support for the sdiv and
udiv instructions, the kernel may overwrite the beginning of those
functions with those instructions and a "bx lr" to get better
performance.

To ensure that those functions are aligned to a 32-bit word for easier
patching (which might not always be the case in Thumb mode) and that
the two patched instructions end up in the same cache line, a 8-byte
alignment is enforced when ARM_PATCH_IDIV is selected.

This was heavily inspired by a previous patch from Stephen Boyd.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-17 10:29:01 +00:00
Prasanna Karthik
38fc2f6c98 ARM: 8476/1: VDSO: use PTR_ERR_OR_ZERO for vma check
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Signed-off-by: Prasanna Karthik <mkarthi3@visteon.com>
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-17 10:29:01 +00:00
Nicolas Pitre
b563d06451 ARM: 8453/2: proc-v7.S: don't locate temporary stack space in .text section
The proc-v7.S code uses a small temporary stack to preserve register
content in its setup code. This stack is located in the .text section
which is normally meant to be read-only.

Move that temporary stack to the .bss section and get its address in
a position independent way, similarly to what we do in other parts
of the kernel.

While at it, one comments was updated to reflect reality, and the list
of saved registers in the proc-v7.S case is updated to match the comment
next to it for coherency.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-17 10:29:01 +00:00
Geert Uytterhoeven
0a03668188 sh: sh7734: Correct SCIF type for BRG
The SCIF variant in the sh7734 SoC is not the common "SH-4(A)" variant,
but a derivative with added "Baud Rate Generator for External Clock"
(BRG). Correct the regtype value in platform data to fix this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2015-12-17 11:19:08 +01:00
Laurent Pinchart
6441d314b4 sh: Remove sci_ick clock alias
The sh-sci driver falls back to the peripheral clock if the sci_ick
clock doesn't exist. There's thus no need to create an alias for the
peripheral clock named sci_ick.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2015-12-17 11:19:05 +01:00
Laurent Pinchart
fa3d39bf25 sh: Rename sci_ick and sci_fck clock to fck
The SCI driver requires a functional clock named "fck" and falls back to
"sci_ick" or "sci_fck" when the "fck" clock doesn't exist. To allow
removal of the fallback code rename the sci_ick and sci_fck clocks to
fck for all SH platforms.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2015-12-17 11:19:03 +01:00
Stewart Smith
98da62b716 powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type
When running on newer OPAL firmware that supports sending extra
OPAL_MSG types, we would print a warning on *every* message received.

This could be a problem for kernels that don't support OPAL_MSG_OCC
on machines that are running real close to thermal limits and the
OCC is throttling the chip. For a kernel that is paying attention to
the message queue, we could get these notifications quite often.

Conceivably, future message types could also come fairly often,
and printing that we didn't understand them 10,000 times provides
no further information than printing them once.

Cc: stable@vger.kernel.org
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 20:42:13 +11:00
Haren Myneni
6333ed8f26 crypto: nx-842 - Mask XERS0 bit in return value
NX842 coprocessor sets 3rd bit in CR register with XER[S0] which is
nothing to do with NX request. Since this bit can be set with other
valuable return status, mast this bit.

One of other bits (INITIATED, BUSY or REJECTED) will be returned for
any given NX request.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-12-17 16:42:12 +08:00
Vineet Gupta
575a9d4e2c ARC: smp: Rename platform hook @init_cpu_smp -> @init_per_cpu
Makes it similar to smp_ops which also has callback with same name

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 12:56:56 +05:30
Noam Camus
b474a02382 ARC: rename smp operation init_irq_cpu() to init_per_cpu()
This will better reflect its description i.e. "any needed setup..."
and not just do an "IPI request".

Signed-off-by: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 12:56:43 +05:30
Vineet Gupta
323f41f9e7 ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing
ARC dwarf unwinder only supports CIE version == 1
The boot time dwarf sanitizer (part of binary lookup table constructor)
would simply bail if it saw CIE version == 3, rendering unwinder with a
NULL lookup table.

It seems libgcc linked with kernel does have such entries.

With fallback linear search removed, and a NULL binary lookup table,
unwinder fails to generate any stack trace.

So allow graceful ignoring of unsupported CIE entries.

This problem was initially seen in Alexey's setup (and not mine) as he
was using buildroot built toolchain (libgcc) which doesn't get built with
CFLAGS_FOR_TARGET="-gdwarf-2 which is my default

Fixes STAR 9000985048: "kernel unwinder broken with stock tools"

Fixes: 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Reported-by Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 11:10:23 +05:30
Vineet Gupta
bc79c9a721 ARC: dw2 unwind: Reinstante unwinding out of modules
The fix which removed linear searching of dwarf (because binary lookup
data always exists) missed out on the fact that modules don't get the
binary lookup tables info. This caused unwinding out of modules to stop
working.

So add binary lookup header setup (equivalent of eh_frame_hdr setup) to
modules as well.

While at it, confine the header setup to within unwinder code,
reducing one API exposed out of unwinder code.

Fixes: 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 11:10:23 +05:30
Vineet Gupta
ff1c0b6a79 ARC: [plat-sim] unbork non default CONFIG_LINUX_LINK_BASE
HIGHMEM support bumped the default memory size for nsim platform to 1G.
Thus total memory ended at the very edge of start of peripherals address
space. With linux link base shifted, memory started bleeding into
peripheral space which caused early boot bad_page spew !

Fixes: 29e332261d ("ARC: mm: HIGHMEM: populate high memory from DT")
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 11:06:43 +05:30
a5e90b1b07 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Further ARM fixes:
   - Anson Huang noticed that we were corrupting a register we shouldn't
     be during suspend on some CPUs.
   - Shengjiu Wang spotted a bug in the 'swp' instruction emulation.
   - Will Deacon fixed a bug in the ASID allocator.
   - Laura Abbott fixed the kernel permission protection to apply to all
     threads running in the system.
   - I've fixed two bugs with the domain access control register
     handling, one to do with printing an appropriate value at oops
     time, and the other to further fix the uaccess_with_memcpy code"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8475/1: SWP emulation: Restore original *data when failed
  ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted
  ARM: fix uaccess_with_memcpy() with SW_DOMAIN_PAN
  ARM: report proper DACR value in oops dumps
  ARM: 8464/1: Update all mm structures with section adjustments
  ARM: 8465/1: mm: keep reserved ASIDs in sync with mm after multiple rollovers
2015-12-16 10:57:24 -08:00
Andrey Smetanin
481d2bcc84 kvm/x86: Remove Hyper-V SynIC timer stopping
It's possible that guest send us Hyper-V EOM at the middle
of Hyper-V SynIC timer running, so we start processing of Hyper-V
SynIC timers in vcpu context and stop the Hyper-V SynIC timer
unconditionally:

    host                                       guest
    ------------------------------------------------------------------------------
                                           start periodic stimer
    start periodic timer
    timer expires after 15ms
    send expiration message into guest
    restart periodic timer
    timer expires again after 15 ms
    msg slot is still not cleared so
    setup ->msg_pending
(1) restart periodic timer
                                           process timer msg and clear slot
                                           ->msg_pending was set:
                                               send EOM into host
    received EOM
      kvm_make_request(KVM_REQ_HV_STIMER)

    kvm_hv_process_stimers():
        ...
        stimer_stop()
        if (time_now >= stimer->exp_time)
                stimer_expiration(stimer);

Because the timer was rearmed at (1), time_now < stimer->exp_time
and stimer_expiration is not called.  The timer then never fires.

The patch fixes such situation by not stopping Hyper-V SynIC timer
at all, because it's safe to restart it without stop in vcpu context
and timer callback always returns HRTIMER_NORESTART.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:51:22 +01:00
Paolo Bonzini
8a86aea920 KVM: vmx: detect mismatched size in VMCS read/write
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	I am sending this as RFC because the error messages it produces are
	very ugly.  Because of inlining, the original line is lost.  The
	alternative is to change vmcs_read/write/checkXX into macros, but
	then you need to have a single huge BUILD_BUG_ON or BUILD_BUG_ON_MSG
	because multiple BUILD_BUG_ON* with the same __LINE__ are not
	supported well.
2015-12-16 18:49:47 +01:00
Paolo Bonzini
845c5b4054 KVM: VMX: fix read/write sizes of VMCS fields in dump_vmcs
This was not printing the high parts of several 64-bit fields on
32-bit kernels.  Separate from the previous one to make the patches
easier to review.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:47 +01:00
Paolo Bonzini
f353105463 KVM: VMX: fix read/write sizes of VMCS fields
In theory this should have broken EPT on 32-bit kernels (due to
reading the high part of natural-width field GUEST_CR3).  Not sure
if no one noticed or the processor behaves differently from the
documentation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:46 +01:00
Li RongQing
0bcf261cc8 KVM: VMX: fix the writing POSTED_INTR_NV
POSTED_INTR_NV is 16bit, should not use 64bit write function

[ 5311.676074] vmwrite error: reg 3 value 0 (err 12)
  [ 5311.680001] CPU: 49 PID: 4240 Comm: qemu-system-i38 Tainted: G I 4.1.13-WR8.0.0.0_standard #1
  [ 5311.689343] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
  [ 5311.699550] 00000000 00000000 e69a7e1c c1950de1 00000000 e69a7e38 fafcff45 fafebd24
  [ 5311.706924] 00000003 00000000 0000000c b6a06dfa e69a7e40 fafcff79 e69a7eb0 fafd5f57
  [ 5311.714296] e69a7ec0 c1080600 00000000 00000001 c0e18018 000001be 00000000 00000b43
  [ 5311.721651] Call Trace:
  [ 5311.722942] [<c1950de1>] dump_stack+0x4b/0x75
  [ 5311.726467] [<fafcff45>] vmwrite_error+0x35/0x40 [kvm_intel]
  [ 5311.731444] [<fafcff79>] vmcs_writel+0x29/0x30 [kvm_intel]
  [ 5311.736228] [<fafd5f57>] vmx_create_vcpu+0x337/0xb90 [kvm_intel]
  [ 5311.741600] [<c1080600>] ? dequeue_task_fair+0x2e0/0xf60
  [ 5311.746197] [<faf3b9ca>] kvm_arch_vcpu_create+0x3a/0x70 [kvm]
  [ 5311.751278] [<faf29e9d>] kvm_vm_ioctl+0x14d/0x640 [kvm]
  [ 5311.755771] [<c1129d44>] ? free_pages_prepare+0x1a4/0x2d0
  [ 5311.760455] [<c13e2842>] ? debug_smp_processor_id+0x12/0x20
  [ 5311.765333] [<c10793be>] ? sched_move_task+0xbe/0x170
  [ 5311.769621] [<c11752b3>] ? kmem_cache_free+0x213/0x230
  [ 5311.774016] [<faf29d50>] ? kvm_set_memory_region+0x60/0x60 [kvm]
  [ 5311.779379] [<c1199fa2>] do_vfs_ioctl+0x2e2/0x500
  [ 5311.783285] [<c11752b3>] ? kmem_cache_free+0x213/0x230
  [ 5311.787677] [<c104dc73>] ? __mmdrop+0x63/0xd0
  [ 5311.791196] [<c104dc73>] ? __mmdrop+0x63/0xd0
  [ 5311.794712] [<c104dc73>] ? __mmdrop+0x63/0xd0
  [ 5311.798234] [<c11a2ed7>] ? __fget+0x57/0x90
  [ 5311.801559] [<c11a2f72>] ? __fget_light+0x22/0x50
  [ 5311.805464] [<c119a240>] SyS_ioctl+0x80/0x90
  [ 5311.808885] [<c1957d30>] sysenter_do_call+0x12/0x12
  [ 5312.059280] kvm: zapping shadow pages for mmio generation wraparound
  [ 5313.678415] kvm [4231]: vcpu0 disabled perfctr wrmsr: 0xc2 data 0xffff
  [ 5313.726518] kvm [4231]: vcpu0 unhandled rdmsr: 0x570

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:45 +01:00
Andrey Smetanin
1f4b34f825 kvm/x86: Hyper-V SynIC timers
Per Hyper-V specification (and as required by Hyper-V-aware guests),
SynIC provides 4 per-vCPU timers.  Each timer is programmed via a pair
of MSRs, and signals expiration by delivering a special format message
to the configured SynIC message slot and triggering the corresponding
synthetic interrupt.

Note: as implemented by this patch, all periodic timers are "lazy"
(i.e. if the vCPU wasn't scheduled for more than the timer period the
timer events are lost), regardless of the corresponding configuration
MSR.  If deemed necessary, the "catch up" mode (the timer period is
shortened until the timer catches up) will be implemented later.

Changes v2:
* Use remainder to calculate periodic timer expiration time

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:45 +01:00
Andrey Smetanin
765eaa0f70 kvm/x86: Hyper-V SynIC message slot pending clearing at SINT ack
The SynIC message protocol mandates that the message slot is claimed
by atomically setting message type to something other than HVMSG_NONE.
If another message is to be delivered while the slot is still busy,
message pending flag is asserted to indicate to the guest that the
hypervisor wants to be notified when the slot is released.

To make sure the protocol works regardless of where the message
sources are (kernel or userspace), clear the pending flag on SINT ACK
notification, and let the message sources compete for the slot again.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:44 +01:00
Andrey Smetanin
93bf417248 kvm/x86: Hyper-V internal helper to read MSR HV_X64_MSR_TIME_REF_COUNT
This helper will be used also in Hyper-V SynIC timers implementation.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:43 +01:00
Andrey Smetanin
0ae80384b2 kvm/x86: Added Hyper-V vcpu_to_hv_vcpu()/hv_vcpu_to_vcpu() helpers
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:42 +01:00
Andrey Smetanin
e18eaeed2b kvm/x86: Rearrange func's declarations inside Hyper-V header
This rearrangement places functions declarations together
according to their functionality, so future additions
will be simplier.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:42 +01:00
Andrey Smetanin
c71acc4c74 drivers/hv: Move struct hv_timer_message_payload into UAPI Hyper-V x86 header
This struct is required for Hyper-V SynIC timers implementation inside KVM
and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into
Hyper-V UAPI header.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:41 +01:00
Andrey Smetanin
5b423efe11 drivers/hv: Move struct hv_message into UAPI Hyper-V x86 header
This struct is required for Hyper-V SynIC timers implementation inside KVM
and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into
Hyper-V UAPI header.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:40 +01:00
Andrey Smetanin
4f39bcfd1c drivers/hv: Move HV_SYNIC_STIMER_COUNT into Hyper-V UAPI x86 header
This constant is required for Hyper-V SynIC timers MSR's
support by userspace(QEMU).

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-16 18:49:40 +01:00
Rafał Miłecki
541c9a84cd ssb: pick SoC invariants code from MIPS BCM47xx arch
There is code in ssb fetching "invariants" that is basically a set of
board specific data. Every host requires its own implementation of
reading function. In ssb we have support for PCI, PCMCIA & SDIO.
For some (historical?) reason code reading "invariants" for SoC was
placed in arch code and provided by a callback. This is not needed
nowadays, so lets move that into ssb. This way we keep all "invariants"
functions in a single module making code cleaner.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-12-16 16:36:25 +02:00
Michael Ellerman
2475c36213 Partial revert of "powerpc: Individual System V IPC system calls"
This partially reverts commit a34236155a.

While reviewing the glibc patch to exploit the individual IPC calls,
Arnd & Andreas noticed that we were still requiring userspace to pass
IPC_64 in order to get the new style IPC API.

With a bit of cleanup in the kernel we can drop that requirement, and
instead only provide the new style API, which will simplify things for
userspace.

Rather than try and sneak that patch into 4.4, instead we will drop the
individual IPC calls for powerpc, and merge them again in 4.5 once the
cleanup patch has gone in.

Because we've already added sys_mlock2() as syscall #378, we don't do a
full revert of the IPC calls. Instead we drop the __NR #defines, and
send those now undefined syscall numbers to sys_ni_syscall(). This
leaves a gap in the syscall numbers, but we'll reuse them when we merge
the individual IPC calls.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2015-12-16 21:52:32 +11:00
Daniel Axtens
00b912b0c8 powerpc: Remove broken GregorianDay()
GregorianDay() is supposed to calculate the day of the week
(tm->tm_wday) for a given day/month/year. In that calcuation it
indexed into an array called MonthOffset using tm->tm_mon-1. However
tm_mon is zero-based, not one-based, so this is off-by-one. It also
means that every January, GregoiranDay() will access element -1 of
the MonthOffset array.

It also doesn't appear to be a correct algorithm either: see in
contrast kernel/time/timeconv.c's time_to_tm function.

It's been broken forever, which suggests no-one in userland uses
this. It looks like no-one in the kernel uses tm->tm_wday either
(see e.g. drivers/rtc/rtc-ds1305.c:319).

tm->tm_wday is conventionally set to -1 when not available in
hardware so we can simply set it to -1 and drop the function.
(There are over a dozen other drivers in drivers/rtc that do
this.)

Found using UBSAN.

Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org> # as an example of what UBSan finds.
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: rtc-linux@googlegroups.com
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-16 12:54:04 +11:00
Dan Williams
3e6110fd54 Revert "scatterlist: use sg_phys()"
commit db0fa0cb01 "scatterlist: use sg_phys()" did replacements of
the form:

    phys_addr_t phys = page_to_phys(sg_page(s));
    phys_addr_t phys = sg_phys(s) & PAGE_MASK;

However, this breaks platforms where sizeof(phys_addr_t) >
sizeof(unsigned long).  Revert for 4.3 and 4.4 to make room for a
combined helper in 4.5.

Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: db0fa0cb01 ("scatterlist: use sg_phys()")
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-12-15 12:54:06 -08:00
5fab517d33 Wire up mlock2() syscall for ia64
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWbxN3AAoJEKurIx+X31iB3ZoP/R2suDwiD91y7TOu0gat4NsY
 rZPYMwrm4gR3w8TLJLcj1ox099zzYcz1OntLjfaqz09hHtbrwz5BPEDM9s1AOkde
 sW8fZjcDlJbUXXRcUuftoaYy55H6Mukji3M8BF+6iNpPLvhB/8IQww84SA+PPJ0L
 +UCfyYUNk0gIWSKJ0sr7OC6n7yUlNuu/31SO1u28II737riFpW7fYT5/yn6Gf4hL
 Pi3QzQ4Ua4+bjzeiXd6EQvRXAYFE4nDTqbVuEkiBbEJ6WAoepcsYUzEFZJtYduTS
 VG8lo8etvU9OsjZGKVTujX021XyHoU5+SVZ4wU5gmL6NMd+qvRTmidbnzBjRy4jL
 7n9YpUwrfS9A3uUPCQDd9nW52z5W8zJjpA5ZbwexYDYzqFLTAeKA/dDLBpiSUkJf
 vDXWdEWWQ64uxwMUre4+EHn+joapZ+Sx/qM3o125X/uJ1UDR3S+b/ZHIGPVm87P6
 IREjiWkGz/z9vkmHBOAP+1yC7udwadHHlkE6Wew5/F4uK5QRUcqqpQmd6ZepaMK1
 TgXsVJrLBlA3umceWL0AxRVxk67LTT4RqIoh6QoJx9iK84iMsgPQ4sSQ10yd1TMf
 6MQqR7BoueVc9TaljWA+t0Yn9RTaiOP8yl3Px6qvNRZHyIY/9VStf1E6JglOTngV
 +eqklm46BNBiZDf9S3QL
 =JaQp
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-mlock2' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 fix from Tony Luck:
 "Wire up mlock2() syscall for ia64"

* tag 'please-pull-mlock2' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Enable mlock2 syscall for ia64
2015-12-15 10:45:29 -08:00
173ae9ba63 Fix user-visible spelling error
Pavel Machek reports a warning about W+X pages found in the "Persisent"
kmap area.  After grepping for it (using the correct spelling), and not
finding it, I noticed how the debug printk was just misspelled.  Fix it.

The actual mapping bug that Pavel reported is still open.  It's
apparently a separate issue from the known EFI page tables, looks like
it's related to the HIGHMEM mappings.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-15 10:15:57 -08:00
James Morse
971c67ce37 arm64: reduce stack use in irq_handler
The code for switching to irq_stack stores three pieces of information on
the stack, fp+lr, as a fake stack frame (that lets us walk back onto the
interrupted tasks stack frame), and the address of the struct pt_regs that
contains the register values from kernel entry. (which dump_backtrace()
will print in any stack trace).

To reduce this, we store fp, and the pointer to the struct pt_regs.
unwind_frame() can recognise this as the irq_stack dummy frame, (as it only
appears at the top of the irq_stack), and use the struct pt_regs values
to find the missing interrupted link-register.

Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-15 17:09:08 +00:00
Guenther Hutzl
32e6b236d2 KVM: s390: consider system MHA for guest storage
Verify that the guest maximum storage address is below the MHA (maximum
host address) value allowed on the host.

Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
[adopt to match recent limit,size changes]

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-12-15 17:08:22 +01:00
Dominik Dingel
a3a92c31bf KVM: s390: fix mismatch between user and in-kernel guest limit
While the userspace interface requests the maximum size the gmap code
expects to get a maximum address.

This error resulted in bigger page tables than necessary for some guest
sizes, e.g. a 2GB guest used 3 levels instead of 2.

At the same time we introduce KVM_S390_NO_MEM_LIMIT, which allows in a
bright future that a guest spans the complete 64 bit address space.

We also switch to TASK_MAX_SIZE for the initial memory size, this is a
cosmetic change as the previous size also resulted in a 4 level pagetable
creation.

Reported-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-12-15 17:08:21 +01:00
Christian Borntraeger
8335713ad0 KVM: s390: obey kptr_restrict in traces
The s390dbf and trace events provide a debugfs interface.
If kptr_restrict is active, we should not expose kernel
pointers. We can fence the debugfs output by using %pK
instead of %p.

Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-12-15 17:06:32 +01:00
Christian Borntraeger
7ec7c8c70b KVM: s390: use assignment instead of memcpy
Replace two memcpy with proper assignment.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-12-15 16:06:48 +01:00
Shengjiu Wang
34bfbae33a ARM: 8475/1: SWP emulation: Restore original *data when failed
__user_swpX_asm maybe failed in first STREX operation, emulate_swpX
will try again, but the *data has been changed in first time. which
causes the result is wrong.
This patch is to fix this issue. When STREX succeed, change the *data.
if it fail, *data is not changed.

Signed-off-by: Shengjiu Wang <shengjiu.wang@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-15 11:51:42 +00:00
Anson Huang
fa0708b320 ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted
In cpu_v7_do_suspend routine, r11 is used while it is NOT
saved/restored, different compiler may have different usage
of ARM general registers, so it may cause issues during
calling cpu_v7_do_suspend.

We meet kernel fault occurs when using GCC 4.8.3, r11 contains
valid value before calling into cpu_v7_do_suspend, but when returned
from this routine, r11 is corrupted and lead to kernel fault.
Doing save/restore for those corrupted registers is a must in
assemble code.

Signed-off-by: Anson Huang <Anson.Huang@freescale.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: <stable@vger.kernel.org> # v3.3+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-15 11:51:41 +00:00
Russell King
c014953d84 ARM: fix uaccess_with_memcpy() with SW_DOMAIN_PAN
The uaccess_with_memcpy() code is currently incompatible with the SW
PAN code: it takes locks within the region that we've changed the DACR,
potentially sleeping as a result.  As we do not save and restore the
DACR across co-operative sleep events, can lead to an incorrect DACR
value later in this code path.

Reported-by: Peter Rosin <peda@axentia.se>
Tested-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-15 11:51:02 +00:00
Will Deacon
129b985cc3 Merge branch 'aarch64/efi' into aarch64/for-next/core
Merge in EFI memblock changes from Ard, which form the preparatory work
for UEFI support on 32-bit ARM.
2015-12-15 10:59:03 +00:00
Daniel Lezcano
751605152b h8300: Rename ctlr_out/in[bwl] to raw_read/write[bwl]
For the sake of consistency, let rename all ctrl_out/in calls to the write/read
calls so we have the same API consistent with the other architectures hence
open the door for the increasing of the test compilation coverage.

The unsigned long coercive cast is removed because all variables are set to
the right type "void __iomem *".

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-12-15 10:12:03 +01:00
Krzysztof Hałasa
3a35e470bc ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
Gateworks Ventana boards seem to need "RGMII-ID" (internal delay)
PHY mode, instead of simple "RGMII", for their Marvell 88E1510
transceiver. Otherwise, the Ethernet MAC doesn't work with Marvell PHY
driver (TX doesn't seem to work correctly).

Tested on GW5400 rev. C.

This bug affects ARM Fedora 23.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Acked-by: Tim Harvey <tharvey@gateworks.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2015-12-15 16:53:15 +08:00
Bai Ping
13fdae1ae5 ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
The 'assigned-clock-parents' and 'assigned-clock-rates' list
should corresponding to the 'assigned-clocks' property clock list.

Signed-off-by: Bai Ping <b51503@freescale.com>
Fixes: ed339363de ("ARM: dts: imx6qdl-sabreauto: Allow HDMI and LVDS to work simultaneously")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2015-12-15 16:48:56 +08:00
Daniel Lezcano
97a23beb8d clocksource/drivers/h8300_timer8: Separate the Kconfig option from the arch
The current Kconfig option is the H8300 arch option. In order to comply to the
current rule, let's create a specific option for the timer8 and select it
from the arch's Kconfig.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-12-15 09:43:59 +01:00
Daniel Lezcano
39366ef421 clocksource/drivers/exynos_mct: Fix Kconfig and add COMPILE_TEST option
Let the platform's Kconfig to select the clock instead of having a reverse
dependency from the driver to the platform options.

Add the COMPILE_TEST option for the compilation test coverage. Due to the
non portable 'delay' code, this driver is only compilable on ARM.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
2015-12-15 09:41:52 +01:00
Daniel Lezcano
1becd6edea clocksource/drivers/prcmu: Fix Kconfig and add COMPILE_TEST option
Let the platform's Kconfig to select the clock instead of having a reverse
dependency from the driver to the platform options.

Add the COMPILE_TEST option for the compilation test coverage.

This change is debatable as the option itself in the Kconfig allows to
select the driver for the platform or not. This change will make the prcmu
timer always selected.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
2015-12-15 09:41:50 +01:00
Daniel Lezcano
389d9b5841 clocksource/drivers/pxa_timer: Move the Kconfig rule
Instead of having the clocksource's Kconfig depending on the arch, let the
arch to select the timer it needs.

The CLKSRC_OF dependency is removed because already selected by the
ARCH_PXA, and it is added for SA1100.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-12-15 09:41:17 +01:00
Daniel Lezcano
2ffdf71b83 clocksource/drivers/st_lpc: Fix Kconfig dependency
Change the Kconfig selection rule by letting the STI arch to select
the timer.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Maxime Coquelin <maxime.coquelin@st.com>
2015-12-15 09:41:14 +01:00
Kevin Hilman
8fcacc0344 Allwinner fixes for 4.4
Two patches, one to fix the touchscreen axis on one Allwinner board, and
 the other one fixing a mutex unlocking issue on one error path in the RSB
 driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWbpHVAAoJEBx+YmzsjxAgbEQQALWoDBtYUwlUfWPL9hHIShbv
 HHkelqcxZgQB2LtNQMufwitmoLH8lhf5C0ILZuLiiA0EZM1V+PrsGLKRKHGeQxYO
 K6g3TVZp3v+Q+vb8T3FsGSbP1dtAm+IXIfknQBT+FLHnp84d0wRa29Kx4e3QFUmi
 8UieBQG4dd45ghv08w1jSa2D1nUEfuAMYG1jZIrTw/ZWTB3siZxKBKEaPccvDFF6
 1WgTwoYoHj05ldLG6W2/MPbW1eosVqZTPvdb+QEAvdDVQsKMOyCqc6O8+khbxoV/
 S5RGGh6tJ6xBbsOnzvqjNjSMCeL7SyiYAxMc5y3rEBfexdMUiJi0Laj1dYVcpuNT
 LERGmxw7FIJB2HNsMqjMa4+GB1+DZBplu3RwYZqplnGe1lT9Rk7Irj6muUuH5cBf
 Ubh0RN2Mo5QNAhkwo4TDqy7xzrYRjC/XOQeQx9SSbFHPGl/raTCm0yd31vUdrZXT
 BTW9V7hIJU6vt6+RKXOqtgjAOqcBLi3npeCrUVY11hrCylKP2gEtQ4dFFGNgUSBu
 eOzDvlrtxVavoQB+XQQ+Js5hZlnXNWYLxR3Bher1xc5NXCI/aHM4pmymSzHjrQmU
 aBfmsJYdWDL9fW18cR9vCgdS1sBIqdz9zu9GW5A8MVVf6s7KoM3vEDX7tzFrghfi
 +mxUdfjbuWHgUxmHMN8J
 =9lSI
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-fixes-for-4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes

Merge "Allwinner fixes for 4.4" from Maxime Ripard:

Allwinner fixes for 4.4

Two patches, one to fix the touchscreen axis on one Allwinner board, and
the other one fixing a mutex unlocking issue on one error path in the RSB
driver.

* tag 'sunxi-fixes-for-4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  bus: sunxi-rsb: unlock on error in sunxi_rsb_read()
  ARM: dts: sunxi: sun6i-a31s-primo81.dts: add touchscreen axis swapping property
2015-12-14 16:59:40 -08:00
Tony Luck
2780188515 [IA64] Enable mlock2 syscall for ia64
New system call added in
  commit a8ca5d0ecb
  mm: mlock: add new mlock system call

Signed-off-by: Tony Luck <tony.luck@intel.com>
2015-12-14 10:30:02 -08:00
Haozhong Zhang
81b1b9ca6d KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
The current handling of accesses to guest MSR_TSC_AUX returns error if
vcpu does not support rdtscp, though those accesses are initiated by
host. This can result in the reboot failure of some versions of
QEMU. This patch fixes this issue by passing those host initiated
accesses for further handling instead.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-14 16:24:08 +01:00
Boris Ostrovsky
20f36e0380 xen/x86/pvh: Use HVM's flush_tlb_others op
Using MMUEXT_TLB_FLUSH_MULTI doesn't buy us much since the hypervisor
will likely perform same IPIs as would have the guest.

More importantly, using MMUEXT_INVLPG_MULTI may not to invalidate the
guest's address on remote CPU (when, for example, VCPU from another guest
is running there).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-12-14 13:53:40 +00:00
Marc Zyngier
3ffa75cd18 arm64: KVM: Remove weak attributes
As we've now switched to the new world switch implementation,
remove the weak attributes, as nobody is supposed to override
it anymore.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-12-14 11:30:44 +00:00
Marc Zyngier
23a13465c8 arm64: KVM: Cleanup asm-offset.c
As we've now rewritten most of our code-base in C, most of the
KVM-specific code in asm-offset.c is useless. Delete-time again!

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:43 +00:00
Marc Zyngier
9d8415d6c1 arm64: KVM: Turn system register numbers to an enum
Having the system register numbers as #defines has been a pain
since day one, as the ordering is pretty fragile, and moving
things around leads to renumbering and epic conflict resolutions.

Now that we're mostly acessing the sysreg file in C, an enum is
a much better type to use, and we can clean things up a bit.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:43 +00:00
Marc Zyngier
1ea66d27e7 arm64: KVM: Move away from the assembly version of the world switch
This is it. We remove all of the code that has now been rewritten.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:43 +00:00
Marc Zyngier
910917bb7d arm64: KVM: Map the kernel RO section into HYP
In order to run C code in HYP, we must make sure that the kernel's
RO section is mapped into HYP (otherwise things break badly).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:43 +00:00
Marc Zyngier
044ac37d12 arm64: KVM: Add compatibility aliases
So far, we've implemented the new world switch with a completely
different namespace, so that we could have both implementation
compiled in.

Let's take things one step further by adding weak aliases that
have the same names as the original implementation. The weak
attributes allows the new implementation to be overriden by the
old one, and everything still work.

At a later point, we'll be able to simply drop the old code, and
everything will hopefully keep working, thanks to the aliases we
have just added. This also saves us repainting all the callers.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:42 +00:00
Marc Zyngier
53fd5b6487 arm64: KVM: Add panic handling
Add the panic handler, together with the small bits of assembly
code to call the kernel's panic implementation.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:42 +00:00
Marc Zyngier
2b28162cf6 arm64: KVM: HYP mode entry points
Add the entry points for HYP mode (both for hypercalls and
exception handling).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:42 +00:00
Marc Zyngier
5eec0a91e3 arm64: KVM: Implement TLB handling
Implement the TLB handling as a direct translation of the assembly
code version.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:41 +00:00
Marc Zyngier
c13d1683df arm64: KVM: Implement fpsimd save/restore
Implement the fpsimd save restore, keeping the lazy part in
assembler (as returning to C would be overkill).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:41 +00:00
Marc Zyngier
be901e9b15 arm64: KVM: Implement the core world switch
Implement the core of the world switch in C. Not everything is there
yet, and there is nothing to re-enter the world switch either.

But this already outlines the code structure well enough.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-12-14 11:30:41 +00:00