Name aliases in the SEAD-3 device tree serial0 & serial1, rather than
uart0 & uart1. This allows the core serial code to make use of the
aliases to ensure that the UARTs are consistently numbered as expected
rather than having the numbering depend upon probe order.
When translating YAMON-provided serial configuration to a device tree
stdout-path property, adjust accordingly such that we continue to
reference a valid alias.
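As an illustration of how such an alias gets consumed (a minimal sketch,
not the exact serial core call sites; the wrapper name is hypothetical),
a driver can resolve a stable line number with of_alias_get_id():

    #include <linux/of.h>

    /*
     * Resolve the fixed line number from the "serialN" alias so numbering
     * no longer depends on probe order (returns -ENODEV if unaliased).
     */
    static int serial_line_from_alias(struct device_node *np)
    {
        return of_alias_get_id(np, "serial");
    }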
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16183/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
YAMON can expose more than 256MB of RAM to Linux on Malta by passing an
ememsize environment variable with the full size, but the kernel then
needs to be careful to choose the corresponding physical memory regions,
avoiding the IO memory window. This is platform dependent, and on Malta
it also depends on the memory layout which varies between system
controllers.
Extend yamon_dt_amend_memory() to generically handle this by taking
[e]memsize bytes of memory from an array of memory regions passed in as
a new parameter. Board code provides this array as appropriate depending
on its own memory map.
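As a rough sketch of the shape this takes (the struct, field names and the
addresses below are illustrative, not necessarily those used by the patch),
board code might describe its usable regions like so:

    #include <linux/types.h>

    struct yamon_mem_region {
        phys_addr_t start;
        phys_addr_t size;   /* maximum usable size at this base */
    };

    /*
     * Example layout: low DDR below the IO window, the rest remapped high.
     * yamon_dt_amend_memory() would consume [e]memsize bytes from these
     * entries in order, skipping whatever does not fit.
     */
    static const struct yamon_mem_region example_mem_regions[] = {
        { 0x00000000, 0x10000000 }, /* first 256MB, below the IO window */
        { 0x90000000, 0x70000000 }, /* memory above the IO window */
        { 0, 0 }                    /* terminator (assumed convention) */
    };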
[paul.burton@imgtec.com: SEAD-3 supports 384MB DDR from 0]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16182/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In preparation for supporting other YAMON-using boards (Malta) & sharing
code to translate information from YAMON into device tree properties,
pull the code doing so for the kernel command line, system memory &
serial configuration out of the SEAD-3 board code.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16181/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The SEAD-3 board doesn't configure the GIC frequency, & never has.
Remove the timer node from the DT in order to avoid attempting to probe
the GIC clocksource/clockevent driver which will produce error messages
such as these during boot:
[ 0.000000] GIC frequency not specified.
[ 0.000000] Failed to initialize '/interrupt-controller@1b1c0000/timer': -22
[ 0.000000] clocksource_probe: no matching clocksources found
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16188/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Adjust the atomic loop in the MIPS_ATOMIC_SET operation of the sysmips
system call to branch straight back to the linked load rather than
jumping via a different subsection (whose purpose remains a mystery to
me).
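For illustration, a minimal sketch of the resulting loop shape (not the
exact kernel asm, and with the uaccess/fault handling elided): on SC
failure the branch now targets the LL label directly.

    static unsigned long atomic_swap_sketch(unsigned long *addr,
                                            unsigned long new)
    {
        unsigned long old, tmp;

        __asm__ __volatile__(
        "1: ll      %[old], (%[addr])      \n"
        "   move    %[tmp], %[new]         \n"
        "   sc      %[tmp], (%[addr])      \n"
        "   beqz    %[tmp], 1b             \n" /* retry straight at the ll */
        : [old] "=&r" (old), [tmp] "=&r" (tmp)
        : [new] "r" (new), [addr] "r" (addr)
        : "memory");

        return old;
    }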
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16150/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
EVA linked loads (LLE) and conditional stores (SCE) should be used on
EVA kernels for the MIPS_ATOMIC_SET operation of the sysmips system
call, or else the atomic set will apply to the kernel view of the
virtual address space (potentially unmapped on EVA kernels) rather than
the user view (TLB mapped).
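A minimal sketch of the idea (the macro names here are illustrative; the
real patch plugs the mnemonics into the existing inline asm): select the
EVA user-view instructions when the kernel is built with CONFIG_EVA.

    #ifdef CONFIG_EVA
    # define USER_LL "lle"  /* linked load through the user mapping */
    # define USER_SC "sce"  /* conditional store through the user mapping */
    #else
    # define USER_LL "ll"
    # define USER_SC "sc"
    #endif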
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15.x-
Patchwork: https://patchwork.linux-mips.org/patch/16151/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The MIPS sysmips system call handler may return directly from the
MIPS_ATOMIC_SET case (mips_atomic_set()) to syscall_exit. This path
restores the static (callee-saved) registers; however, they won't have
been saved on entry to the system call.
Use the save_static_function() macro to create a __sys_sysmips wrapper
function which saves the static registers before calling sys_sysmips, so
that the correct static register state is restored by syscall_exit.
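Roughly, the change amounts to the following (a sketch; the syscall table
entries in the scall*.S files are switched over to the generated wrapper):

    /*
     * Emits __sys_sysmips, which saves the static registers (s0-s7) into
     * the stack frame before calling sys_sysmips, so that syscall_exit
     * restores sane values.
     */
    save_static_function(sys_sysmips);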
Fixes: f1e39a4a61 ("MIPS: Rewrite sysmips(MIPS_ATOMIC_SET, ...) in C with inline assembler")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16149/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The inline asm retry check in the MIPS_ATOMIC_SET operation of the
sysmips system call has been backwards since commit f1e39a4a61 ("MIPS:
Rewrite sysmips(MIPS_ATOMIC_SET, ...) in C with inline assembler")
merged in v2.6.32, resulting in the non-R10000_LLSC_WAR case retrying
until the operation was non-atomic, before returning the new value that
was probably just written multiple times instead of the old value.
Invert the branch condition to fix that particular issue.
Fixes: f1e39a4a61 ("MIPS: Rewrite sysmips(MIPS_ATOMIC_SET, ...) in C with inline assembler")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16148/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add a definition of the perf registers for the new I6500 core.
Since I6500 has the same event definitions as I6400, re-use the existing
i6400 map structures by renaming them to a slightly more generic
'i6x00_***_map'.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16362/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Introduce the I6500 PRID & probe it just the same way as I6400. The MIPS
I6500 is the latest in Imagination Technologies' I-Class range of CPUs,
with a focus on scalability & heterogeneity. It introduces the notion of
multiple clusters to the MIPS Coherent Processing System, allowing for a
far higher total number of cores & threads in a system when compared
with its predecessors. Clusters don't need to be identical, and may
contain differing numbers of cores & IOCUs, or cores with differing
properties.
This patch adds only the basic support needed to boot Linux on an I6500
CPU; support for its new functionality will be introduced in further
patches.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16190/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Recent CPUs from Imagination Technologies such as the I6400 or P6600 are
able to speculatively fetch data from memory into caches. This means
that if used in a system with non-coherent DMA they require that caches
be invalidated after a device performs DMA, and before the CPU reads the
DMA'd data, in order to ensure that stale values weren't speculatively
prefetched.
Such CPUs also introduced Memory Accessibility Attribute Registers
(MAARs) in order to control the regions in which they are allowed to
speculate. Thus we can use the presence of MAARs as a good indication
that the CPU requires the above cache maintenance. Use the presence of
MAARs to determine the result of cpu_needs_post_dma_flush() in the
default case, in order to handle these recent CPUs correctly.
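A rough sketch of what the default case becomes (the existing per-CPU
special cases are kept; the non-coherent-DMA check and surrounding code
are elided here):

    static inline bool cpu_needs_post_dma_flush(struct device *dev)
    {
        switch (boot_cpu_type()) {
        case CPU_R10000:
        case CPU_R12000:
        case CPU_BMIPS5000:
            return true;
        default:
            /*
             * CPUs with MAARs may speculatively prefetch into the caches,
             * so they also need caches invalidated after non-coherent DMA
             * from a device.
             */
            return cpu_has_maar;
        }
    }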
Note that the return type of cpu_needs_post_dma_flush() is changed to
bool, such that it's clearer what's happening when cpu_has_maar is cast
to bool for the return value. If this patch were backported to a
pre-v4.7 kernel then MIPS_CPU_MAAR was 1ull<<34, so when cast to an int
we would incorrectly return 0. It so happens that MIPS_CPU_MAAR is
currently 1ull<<30, so when truncated to an int gives a non-zero value
anyway, but even so the implicit conversion from long long int to bool
makes it clearer to understand what will happen than the implicit
conversion from long long int to int would. The bool return type also
fits this usage better semantically, so seems like an all-round win.
Thanks to Ed for spotting the issue for pre-v4.7 kernels & suggesting
the return type change.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Ed Blake <ed.blake@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16363/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
KProbes of __seccomp_filter() are not very useful without access to
the syscall arguments.
Do what x86 does, and populate a struct seccomp_data to be passed to
__secure_computing(). This allows samples/bpf/tracex5 to extract a
sensible trace.
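A sketch of the approach (mirroring the x86 path; the surrounding MIPS
syscall-entry plumbing is elided, and 'syscall' and 'regs' stand for the
values already available at that point):

    if (test_thread_flag(TIF_SECCOMP)) {
        struct seccomp_data sd;
        unsigned long args[6];
        int i;

        sd.nr = syscall;
        sd.arch = syscall_get_arch();
        syscall_get_arguments(current, regs, 0, 6, args);
        for (i = 0; i < 6; i++)
            sd.args[i] = args[i];
        sd.instruction_pointer = KSTK_EIP(current);

        if (__secure_computing(&sd) == -1)
            return -1;
    }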
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16368/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since the eBPF machine has 64-bit registers, we only support this in
64-bit kernels. As of the writing of this commit log test_bpf is showing:
test_bpf: Summary: 316 PASSED, 0 FAILED, [308/308 JIT'ed]
All current test cases are successfully compiled.
Many examples in samples/bpf are usable; in particular tracex5, which
uses tail calls, works.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16369/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Instead of doing a linear search through the insn_table for each
instruction, use the opcode as direct index into the table. This will
give constant time lookup performance as the number of supported
opcodes increases. Make the tables const as they are only ever read.
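Sketched out (the initializer macro and field flags are as used in uasm,
but treat this as illustrative rather than the literal table):

    static const struct insn insn_table[insn_invalid] = {
        [insn_addiu] = {M(addiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM},
        [insn_addu]  = {M(spec_op, 0, 0, 0, 0, addu_op), RS | RT | RD},
        /* ... one designated initializer per enum opcode value ... */
    };

    static const struct insn *get_insn(enum opcode opc)
    {
        if (opc >= ARRAY_SIZE(insn_table))
            return NULL;
        return &insn_table[opc];    /* constant time, no linear scan */
    }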
For uasm-mips.c, sort the table alphabetically and remove duplicate
entries; uasm-micromips.c was already sorted and duplicate-free.
There is a small saving in object size as struct insn loses a field:
$ size arch/mips/mm/uasm-mips.o arch/mips/mm/uasm-mips.o.save
   text    data     bss     dec     hex filename
  10040       0       0   10040    2738 arch/mips/mm/uasm-mips.o
   9240    1120       0   10360    2878 arch/mips/mm/uasm-mips.o.save
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16365/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The module load code has previously had entirely separate
implementations for rel & rela style relocs, which unnecessarily
duplicates a whole lot of code. Unify the implementations of both types
of reloc, sharing the bulk of the code.
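The handlers end up with a single shape along these lines (a sketch; the
exact prototype is illustrative): the REL path passes the existing
instruction word in as the base, while the RELA path carries the addend
through v.

    static int apply_r_mips_32(struct module *me, u32 *location,
                               u32 base, Elf_Addr v, bool rela)
    {
        *location = base + v;   /* identical body for REL and RELA */
        return 0;
    }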
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15832/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If we hit an error whilst processing a reloc then we would return early
from apply_relocate & potentially not free entries in r_mips_hi16_list,
thereby leaking memory. Fix this by ensuring that we always run the code
to free r_mips_hi16_list when errors occur.
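In outline, the loop changes from returning on error to breaking out to a
common exit (the helper names below are hypothetical):

    struct mips_hi16 *r_mips_hi16_list = NULL;
    unsigned int i;
    int err = 0;

    for (i = 0; i < num_relocs; i++) {
        err = handle_one_reloc(me, i, &r_mips_hi16_list); /* hypothetical */
        if (err)
            break;      /* was: return err, leaking the list */
    }

    /* Always run the cleanup, even when a reloc failed. */
    free_hi16_list(r_mips_hi16_list);   /* hypothetical helper */
    return err;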
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 861667dc82 ("MIPS: Fix race condition in module relocation code.")
Fixes: 04211a5746 ("MIPS: Bail on unsupported module relocs")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15831/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull x86 fix from Thomas Gleixner:
"A single fix to unbreak the vdso32 build for 64bit kernels caused by
excess #includes in the mshyperv header"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mshyperv: Remove excess #includes from mshyperv.h
Pull timer fixes from Thomas Gleixner:
"A few fixes for timekeeping and timers:
- Plug a subtle race due to a missing READ_ONCE() in the timekeeping
code where reloading of a pointer results in an inconsistent
callback argument being supplied to the clocksource->read function.
- Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the
timekeeping core code, to prevent a possible discontinuity.
- Apply a similar fix to the arm64 vdso clock_gettime()
implementation.
- Add missing includes to clocksource drivers, which relied on
indirect includes that fail in certain configs.
- Use the proper iomem pointer for read/iounmap in a probe function"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW
time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
time: Fix clock->read(clock) race around clocksource changes
clocksource: Explicitly include linux/clocksource.h when needed
clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:
- Return the proper error code if aux buffers for an event are not
supported.
- Calculate the probe offset for inlined functions correctly
- Update the Skylake DTLB load/store miss event so it can count 1G
TLB entries as well"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Fix probe definition for inlined functions
perf/x86/intel: Add 1G DTLB load/store miss support for SKL
perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)
A recent commit included linux/slab.h in linux/irq.h. This breaks the build
of vdso32 on a 64-bit kernel.
The reason is that linux/irq.h gets included into the vdso code via
linux/interrupt.h which is included from asm/mshyperv.h. That makes the
32-bit vdso compile fail, because slab.h includes the pgtable headers for
64-bit on a 64-bit build.
Neither linux/clocksource.h nor linux/interrupt.h is needed in the
mshyperv.h header file itself - it has a dependency on <linux/atomic.h>.
Remove the includes and unbreak the build.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: devel@linuxdriverproject.org
Fixes: dee863b571 ("hv: export current Hyper-V clocksource")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.12. Most of these actually came in last
week but got held up for some more testing.
- three fixes for kprobes/ftrace/livepatch interactions.
- properly handle data breakpoints when using the Radix MMU.
- fix for perf sampling of registers during call_usermodehelper().
- properly initialise the thread_info on our emergency stacks
- add an explicit flush when doing TLB invalidations for a process
using NPU2.
Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi
Bangoria, Masami Hiramatsu"
* tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Initialise thread_info for emergency stacks
powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
powerpc/perf: Fix oops when kthread execs user process
powerpc/64s: Handle data breakpoints in Radix mode
powerpc/kprobes: Skip livepatch_handler() for jprobes
powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
powerpc/kprobes: Pause function_graph tracing during jprobes handling
Emergency stacks have their thread_info mostly uninitialised, which in
particular means garbage preempt_count values.
Emergency stack code runs with interrupts disabled entirely, and is
used very rarely, so this has been unnoticed so far. It was found by a
proposed new powerpc watchdog that takes a soft-NMI directly from the
masked_interrupt handler and uses the emergency stack. That crashed
at BUG_ON(in_nmi()) in nmi_enter(). preempt_count()s were found to be
garbage.
To fix this, zero the entire THREAD_SIZE allocation, and initialize
the thread_info.
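A minimal sketch of the fix (the helper name is illustrative, not
necessarily the one used in setup_64.c):

    static void *__init alloc_emergency_stack(unsigned long limit, int cpu)
    {
        void *ptr = __va(memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit));
        struct thread_info *ti = ptr;

        memset(ptr, 0, THREAD_SIZE);        /* zero the whole allocation */
        ti->cpu = cpu;
        ti->preempt_count = 0;              /* not HARDIRQ_OFFSET; that broke Cell */
        return ptr;
    }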
Cc: stable@vger.kernel.org
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move it all into setup_64.c, use a function not a macro. Fix
crashes on Cell by setting preempt_count to 0 not HARDIRQ_OFFSET]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
TF is handled a bit differently for syscall and sysret, compared
to the other instructions: TF is checked after the instruction completes,
so that the OS can disable #DB at a syscall by adding TF to FMASK.
When the sysret is executed the #DB is taken "as if" the syscall insn
just completed.
KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
Fix the behavior, otherwise you could get #DB on a user stack which is not
nice. This does not affect Linux guests, as they use an IST or task gate
for #DB.
This fixes CVE-2017-7518.
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
NPU2 requires an extra explicit flush to an active GPU PID when
sending address translation shoot downs (ATSDs) to reliably flush the
GPU TLB. This patch adds just such a flush at the end of each sequence
of ATSDs.
We can safely use PID 0 which is always reserved and active on the
GPU. PID 0 is only used for init_mm which will never be a user mm on
the GPU. To enforce this we add a check in pnv_npu2_init_context()
just in case someone tries to use PID 0 on the GPU.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Use true/false for bool literals]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For real-space designation asces the asce origin part is only a token.
The asce token origin must not be used to generate an effective
address for storage references. This however is erroneously done
within kvm_s390_shadow_tables().
Furthermore within the same function the wrong parts of virtual
addresses are used to generate a corresponding real address
(e.g. the region second index is used as region first index).
Both of the above can result in incorrect address translations. The
translation was only correct for real-space designations with a token
origin of zero and addresses below one megabyte.
Furthermore replace a "!asce.r" statement with a "!*fake" statement to
make it more obvious that a specific condition has nothing to do with
the architecture, but with the fake handling of real space designations.
Fixes: 3218f7094b ("s390/mm: support real-space for gmap shadows")
Cc: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The current DTLB load/store miss events (0x608/0x649) only count 4K, 2M
and 4M page sizes.
Need to extend the events to support any page size (4K/2M/4M/1G).
The complete DTLB load/store miss events are:
DTLB_LOAD_MISSES.WALK_COMPLETED 0xe08
DTLB_STORE_MISSES.WALK_COMPLETED 0xe49
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20170619142609.11058-1-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit fixes a "maybe-uninitialized" build failure in
arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all
enabled. The failure is:
In file included from ./include/linux/printk.h:329:0,
from ./include/linux/kernel.h:13,
from ./include/asm-generic/bug.h:15,
from ./arch/mips/include/asm/bug.h:41,
from ./include/linux/bug.h:4,
from ./include/linux/thread_info.h:11,
from ./include/asm-generic/current.h:4,
from ./arch/mips/include/generated/asm/current.h:1,
from ./include/linux/sched.h:11,
from arch/mips/kvm/tlb.c:13:
arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’:
./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
__dynamic_pr_debug(&descriptor, pr_fmt(fmt), \
^~~~~~~~~~~~~~~~~~
arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here
int idx_user, idx_kernel;
^~~~~~~~~~
There is a similar error relating to "idx_user". Both errors were
observed with GCC 6.
As far as I can tell, it is impossible for either idx_user or idx_kernel
to be uninitialized when they are later read in the calls to kvm_debug,
but to satisfy the compiler, add zero initializers to both variables.
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Fixes: 57e3869cfa ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID")
Cc: <stable@vger.kernel.org> # 4.11+
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* fix problems that could cause hangs or crashes in the host on POWER9
* fix problems that could allow guests to potentially affect or disrupt
the execution of the controlling userspace
Recently vDSO support for CLOCK_MONOTONIC_RAW was added in
49eea433b3 ("arm64: Add support for CLOCK_MONOTONIC_RAW in
clock_gettime() vDSO"). Noticing that the core timekeeping code
never set tkr_raw.xtime_nsec, the vDSO implementation didn't
bother exposing it via the data page and instead took the
unshifted tk->raw_time.tv_nsec value which was then immediately
shifted left in the vDSO code.
Unfortunately, by accelerating the MONOTONIC_RAW clockid, it
uncovered potential 1ns time inconsistencies caused by the
timekeeping core not handling sub-ns resolution.
Now that the core code has been fixed and is actually setting
tkr_raw.xtime_nsec, we need to take that into account in the
vDSO by adding it to the shifted raw_time value, in order to
fix the user-visible inconsistency. Rather than do that at each
use (and expand the data page in the process), instead perform
the shift/addition operation when populating the data page and
remove the shift from the vDSO code entirely.
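Concretely, the data-page population in the arm64 update_vsyscall() ends
up along these lines (a sketch of the relevant assignments only):

    vdso_data->raw_time_sec  = tk->raw_time.tv_sec;
    vdso_data->raw_time_nsec = (tk->raw_time.tv_nsec <<
                                tk->tkr_raw.shift) +
                               tk->tkr_raw.xtime_nsec;
    /* the vDSO then uses raw_time_nsec as-is, with no extra shift */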
[jstultz: minor whitespace tweak, tried to improve commit
message to make it more clear this fixes a regression]
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs like gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunately.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and the vma tree's subtree_gap support for that.
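For reference, the new helper looks roughly like this (the 1MB default is
expressed in pages so larger base page sizes scale the gap):

    unsigned long stack_guard_gap = 256UL << PAGE_SHIFT; /* 1MB with 4kB pages */

    static inline unsigned long vm_start_gap(struct vm_area_struct *vma)
    {
        unsigned long vm_start = vma->vm_start;

        if (vma->vm_flags & VM_GROWSDOWN) {
            vm_start -= stack_guard_gap;
            if (vm_start > vma->vm_start)   /* clamp on underflow */
                vm_start = 0;
        }
        return vm_start;
    }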
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Stream of fixes has slowed down, only a few this week:
- Some DT fixes for Allwinner platforms, and addition of a clock to
the R_CCU clock controller that had been missed.
- A couple of small DT fixes for am335x-sl50"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Merge tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Allwinner fixes for 4.12
A few fixes around the PRCM support that got in 4.12 with a wrong
compatible, and a missing clock in the binding.
* tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Signed-off-by: Olof Johansson <olof@lixom.net>
Merge tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Two fixes for am335x-sl50 to fix a boot time error when claiming SPI
pins, and to fix an SDIO card detect pin for the production version of
the device.
* tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull MIPS fixes from Ralf Baechle:
- Three highmem fixes:
+ Fixed mapping initialization
+ Adjust the pkmap location
+ Ensure we use at most one page for PTEs
- Fix makefile dependencies for .its targets to depend on vmlinux
- Fix reversed condition in BNEZC and JIALC software branch emulation
- Only flush in flush_insn_slot() if the probe has been initialised, to
avoid a NULL pointer dereference
- perf: Remove incorrect odd/even counter handling for I6400
- ftrace: Fix init functions tracing
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: .its targets depend on vmlinux
MIPS: Fix bnezc/jialc return address calculation
MIPS: kprobes: flush_insn_slot should flush only if probe initialised
MIPS: ftrace: fix init functions tracing
MIPS: mm: adjust PKMAP location
MIPS: highmem: ensure that we don't use more than one page for PTEs
MIPS: mm: fixed mappings: correct initialisation
MIPS: perf: Remove incorrect odd/even counter handling for I6400
Pull x86 fixes from Thomas Gleixner:
"Two fixlets for x86:
- Handle WARN_ONs proper with the new UD based WARN implementation
- Disable 1G mappings when 2M mappings are disabled by kmemleak or
debug_pagealloc. Otherwise 1G mappings might still be used,
confusing the debug mechanisms"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Disable 1GB direct mappings when disabling 2MB mappings
x86/debug: Handle early WARN_ONs proper
Merge tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Three small fixes for recently merged code:
- remove a spurious WARN_ON when a PCI device has no of_node, it's
allowed in some circumstances for there to be no of_node.
- fix the offset for store EOI MMIOs in the XIVE interrupt
controller.
- fix non-const WARN_ONs which were becoming BUGs due to them losing
BUGFLAG_WARNING in a recent cleanup patch.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Benjamin
Herrenschmidt"
* tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path
powerpc/xive: Fix offset for store EOI MMIOs
powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node
When a kthread calls call_usermodehelper() the steps are:
1. allocate current->mm
2. load_elf_binary()
3. populate current->thread.regs
While doing this, interrupts are not disabled. If there is a perf
interrupt in the middle of this process (i.e. step 1 has completed but
step 3 has not yet been reached) and perf tries to read userspace regs,
the kernel oopses with the following log:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc0000000000da0fc
...
Call Trace:
perf_output_sample_regs+0x6c/0xd0
perf_output_sample+0x4e4/0x830
perf_event_output_forward+0x64/0x90
__perf_event_overflow+0x8c/0x1e0
record_and_restart+0x220/0x5c0
perf_event_interrupt+0x2d8/0x4d0
performance_monitor_exception+0x54/0x70
performance_monitor_common+0x158/0x160
--- interrupt: f01 at avtab_search_node+0x150/0x1a0
LR = avtab_search_node+0x100/0x1a0
...
load_elf_binary+0x6e8/0x15a0
search_binary_handler+0xe8/0x290
do_execveat_common.isra.14+0x5f4/0x840
call_usermodehelper_exec_async+0x170/0x210
ret_from_kernel_thread+0x5c/0x7c
Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace
pt_regs are not set.
Fixes: ed4a4ef85c ("powerpc/perf: Add support for sampling interrupt register state")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On Power9, trying to use data breakpoints throws the splat shown
below. This is because the check for a data breakpoint in DSISR is in
do_hash_page(), which is not called when in Radix mode.
Unable to handle kernel paging request for data at address 0xc000000000e19218
Faulting instruction address: 0xc0000000001155e8
cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20]
pc: c0000000001155e8: find_pid_ns+0x48/0xe0
lr: c000000000116ac4: find_task_by_vpid+0x44/0x90
sp: c0000000ef1e7da0
msr: 9000000000009033
dar: c000000000e19218
dsisr: 400000
Move the check to handle_page_fault() so as to catch data breakpoints
in both Hash and Radix MMU modes.
We have to change the check in do_hash_page() against 0xa410 to use
0xa450, so as to include the value of (DSISR_DABRMATCH << 16).
There are two sites that call handle_page_fault() when in Radix, both
already pass DSISR in r4.
Fixes: caca285e5a ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
Cc: stable@vger.kernel.org # v4.7+
Reported-by: Shriya R. Kulkarni <shriykul@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Fix the fall-through case on hash, we need to reload DSISR]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ftrace_caller() depends on a modified regs->nip to detect if a certain
function has been livepatched. However, with KPROBES_ON_FTRACE, it is
possible for regs->nip to have been modified by the kprobes pre_handler
(jprobes, for instance). In this case, we do not want to invoke the
livepatch_handler so as not to consume the livepatch stack.
To distinguish between the two (kprobes and livepatch), we check if
there is an active kprobe on the current function. If there is, then we
know for sure that it must have modified the NIP as we don't support
livepatching a kprobe'd function. In this case, we simply skip the
livepatch_handler and branch to the new NIP. Otherwise, the
livepatch_handler is invoked.
Fixes: ead514d5fb ("powerpc/kprobes: Add support for KPROBES_ON_FTRACE")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For DYNAMIC_FTRACE_WITH_REGS, we should be passing in the original set
of registers in pt_regs, to capture the state _before_ ftrace_caller.
However, we are instead passing the stack pointer *after* allocating a
stack frame in ftrace_caller. Fix this by saving the proper value of r1
in pt_regs. Also, use SAVE_10GPRS() to simplify the code.
Fixes: 153086644f ("powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This fixes a crash when function_graph and jprobes are used together.
This is essentially commit 237d28db03 ("ftrace/jprobes/x86: Fix
conflict between jprobes and function graph tracing"), but for powerpc.
Jprobes breaks function_graph tracing since the jprobe hook needs to use
jprobe_return(), which never returns back to the hook, but instead to
the original jprobe'd function. The solution is to momentarily pause
function_graph tracing before invoking the jprobe hook and re-enable it
when returning back to the original jprobe'd function.
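In outline (a sketch; the jprobe register/stack handling already present
in the powerpc kprobes code is elided):

    int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
    {
        /* ... save regs/stack for the jprobe as before ... */
        pause_graph_tracing();      /* about to divert into the jprobe hook */
        return 1;
    }

    int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
    {
        /* ... restore regs/stack as before ... */
        unpause_graph_tracing();    /* back in the original function */
        return 1;
    }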
Fixes: 6794c78243 ("powerpc64: port of the function graph tracer")
Cc: stable@vger.kernel.org # v2.6.30+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When trapped on WARN_ON(), report_bug() is expected to return
BUG_TRAP_TYPE_WARN so the caller will increment NIP by 4 and continue.
The __builtin_constant_p() path of the PPC's WARN_ON()
calls (indirectly) __WARN_FLAGS(), which has BUGFLAG_WARNING set;
however, the other branch does not, which makes report_bug() report a
bug rather than a warning.
Fixes: f26dee1510 ("debug: Avoid setting BUGFLAG_WARNING twice")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 DD1 has an erratum where writing to the TBU40 register, which
is used to apply an offset to the timebase, can cause the timebase to
lose counts. This results in the timebase on some CPUs getting out of
sync with other CPUs, which then results in misbehaviour of the
timekeeping code.
To work around the problem, we make KVM ignore the timebase offset for
all guests on POWER9 DD1 machines. This means that live migration
cannot be supported on POWER9 DD1 machines.
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
At present, HV KVM on POWER8 and POWER9 machines loses any instruction
or data breakpoint set in the host whenever a guest is run.
Instruction breakpoints are currently only used by xmon, but ptrace
and the perf_event subsystem can set data breakpoints as well as xmon.
To fix this, we save the host values of the debug registers (CIABR,
DAWR and DAWRX) before entering the guest and restore them on exit.
To provide space to save them in the stack frame, we expand the stack
frame allocated by kvmppc_hv_entry() from 112 to 144 bytes.
Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Architecturally we should apply a 0x400 offset for these. Not doing
it will break future HW implementations.
The offset of 0 is supposed to remain for "triggers" though not all
sources support both trigger and store EOI, and in P9 specifically,
some sources will treat 0 as a store EOI. But future chips will not.
So this makes us use the properly architected offset which should work
always.
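In code terms the change boils down to something like this (the wrapper
function is illustrative; the constant is the architected store-EOI page
offset):

    #define XIVE_ESB_STORE_EOI  0x400   /* store at this offset to EOI */

    static void xive_store_eoi(struct xive_irq_data *xd)
    {
        out_be64(xd->eoi_mmio + XIVE_ESB_STORE_EOI, 0);
    }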
Fixes: 243e25112d ("powerpc/xive: Native exploitation of the XIVE interrupt controller")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>