Commit graph

13326 commits

Author SHA1 Message Date
KOSAKI Motohiro
b80ef10e84 x86: Move do_page_fault()'s error path under unlikely()
Ingo suggested SIGKILL check should be moved into slowpath
function. This will reduce the page fault fastpath impact
of this recent commit:

  37b23e0525: x86,mm: make pagefault killable

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: minchan.kim@gmail.com
Cc: willy@linux.intel.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/4DDE0B5C.9050907@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-26 13:54:03 +02:00
Ingo Molnar
de66ee979d Merge branch 'linus' into x86/urgent
Merge reason: we want to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-26 13:51:35 +02:00
Matthew Garrett
916f676f8d x86, efi: Retain boot service code until after switching to virtual mode
UEFI stands for "Unified Extensible Firmware Interface", where "Firmware"
is an ancient African word meaning "Why do something right when you can
do it so wrong that children will weep and brave adults will cower before
you", and "UEI" is Celtic for "We missed DOS so we burned it into your
ROMs". The UEFI specification provides for runtime services (ie, another
way for the operating system to be forced to depend on the firmware) and
we rely on these for certain trivial tasks such as setting up the
bootloader. But some hardware fails to work if we attempt to use these
runtime services from physical mode, and so we have to switch into virtual
mode. So far so dreadful.

The specification makes it clear that the operating system is free to do
whatever it wants with boot services code after ExitBootServices() has been
called. SetVirtualAddressMap() can't be called until ExitBootServices() has
been. So, obviously, a whole bunch of EFI implementations call into boot
services code when we do that. Since we've been charmingly naive and
trusted that the specification may be somehow relevant to the real world,
we've already stuffed a picture of a penguin or something in that address
space. And just to make things more entertaining, we've also marked it
non-executable.

This patch allocates the boot services regions during EFI init and makes
sure that they're executable. Then, after SetVirtualAddressMap(), it
discards them and everyone lives happily ever after. Except for the ones
who have to work on EFI, who live sad lives haunted by the knowledge that
someone's eventually going to write yet another firmware specification.

[ hpa: adding this to urgent with a stable tag since it fixes currently-broken
  hardware.  However, I do not know what the dependencies are and so I do
  not know which -stable versions this may be a candidate for. ]

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Link: http://lkml.kernel.org/r/1306331593-28715-1-git-send-email-mjg@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: <stable@kernel.org>
2011-05-25 17:03:53 -07:00
0c63e38a12 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: New driver for the SMSC EMC6W201
  hwmon: (abituguru) Depend on DMI
  hwmon: (it87) Use request_muxed_region
  hwmon: (sch5627) Trigger Vbat measurements
  hwmon: (sch5627) Add sch5627_send_cmd function
  i8k: Integrate with the hwmon subsystem
  hwmon: (max6650) Properly support the MAX6650
  hwmon: (max6650) Drop device detection
  Move ACPI power meter driver to hwmon
  hwmon: (f71882fg) Add support for F71808A
  hwmon: (f71882fg) Split has_beep in fan_has_beep and temp_has_beep
  hwmon: (asc7621) Drop duplicate dependency
  hwmon: (jc42) Change detection class
  hwmon: Add driver for AMD family 15h processor power information
  hwmon: (k10temp) Add support for Fam15h (Bulldozer)
  hwmon: Use helper functions to set and get driver data
  i8k: Avoid lahf in 64-bit code
2011-05-25 16:52:50 -07:00
Nikhil P Rao
8b27f2ff7a x86: Remove unnecessary check in detect_ht()
This patch removes a check that causes incorrect scheduler
domain setup (SMP instead of SMT) and bootlog warning messages
when cpuid extensions for topology enumeration are not supported
and the number of processors reported to the OS is smaller than
smp_num_siblings.

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Nikhil P Rao <nikhil.rao@intel.com>
Link: http://lkml.kernel.org/r/1306343921.19325.1.camel@fedora13
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-25 23:01:08 +02:00
Jean Delvare
949a9d7002 i8k: Integrate with the hwmon subsystem
Let i8k create an hwmon class device so that libsensors will expose
the CPU temperature and fan speeds to monitoring applications.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Massimo Dal Zotto <dz@debian.org>
2011-05-25 20:43:33 +02:00
Stephen Boyd
5ca43f6c3b lib: consolidate DEBUG_STACK_USAGE option
Most arches define CONFIG_DEBUG_STACK_USAGE exactly the same way.  Move it
to lib/Kconfig.debug so each arch doesn't have to define it.  This
obviously makes the option generic, but that's fine because the config is
already used in generic code.

It's not obvious to me that sysrq-P actually does anything caution by
keeping the most inclusive wording.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:54 -07:00
Stephen Boyd
44ec7abe35 lib: consolidate DEBUG_PER_CPU_MAPS
DEBUG_PER_CPU_MAPS is used in lib/cpumask.c as well as in
inlcude/linux/cpumask.h and thus it has outgrown its use within x86 and
powerpc alone.  Any arch with SMP support may want to get some more
debugging, so make this option generic.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:53 -07:00
Mike Travis
162a7e7500 printk: allocate kernel log buffer earlier
On larger systems, because of the numerous ACPI, Bootmem and EFI messages,
the static log buffer overflows before the larger one specified by the
log_buf_len param is allocated.  Minimize the overflow by allocating the
new log buffer as soon as possible.

On kernels without memblock, a later call to setup_log_buf from
kernel/init.c is the fallback.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix CONFIG_PRINTK=n build]
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:48 -07:00
Roland Dreier
dbee8a0aff x86: remove 32-bit versions of readq()/writeq()
The presense of a writeq() implementation on 32-bit x86 that splits the
64-bit write into two 32-bit writes turns out to break the mpt2sas driver
(and in general is risky for drivers as was discussed in
<http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
revert 2c5643b1c5 ("x86: provide readq()/writeq() on 32-bit too") and
follow-on cleanups.

This unfortunately leads to pushing non-atomic definitions of readq() and
write() to various x86-only drivers that in the meantime started using the
definitions in the x86 version of <asm/io.h>.  However as discussed
exhaustively, this is actually the right thing to do, because the right
way to split a 64-bit transaction is hardware dependent and therefore
belongs in the hardware driver (eg mpt2sas needs a spinlock to make sure
no other accesses occur in between the two halves of the access).

Build tested on 32- and 64-bit x86 allmodconfig.

Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
Acked-by: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Ravi Anand <ravi.anand@qlogic.com>
Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Acked-by: James Bottomley <James.Bottomley@parallels.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:44 -07:00
Ying Han
1495f230fa vmscan: change shrinker API by passing shrink_control struct
Change each shrinker's API by consolidating the existing parameters into
shrink_control struct.  This will simplify any further features added w/o
touching each file of shrinker.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix warning]
[kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
[akpm@linux-foundation.org: fix xfs warning]
[akpm@linux-foundation.org: update gfs2]
Signed-off-by: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:26 -07:00
KOSAKI Motohiro
de03c72cfc mm: convert mm->cpu_vm_cpumask into cpumask_var_t
cpumask_t is very big struct and cpu_vm_mask is placed wrong position.
It might lead to reduce cache hit ratio.

This patch has two change.
1) Move the place of cpumask into last of mm_struct. Because usually cpumask
   is accessed only front bits when the system has cpu-hotplug capability
2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory
   footprint if cpumask_size() will use nr_cpumask_bits properly in future.

In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var.
It may help to detect out of tree cpu_vm_mask users.

This patch has no functional change.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:21 -07:00
Peter Zijlstra
3d48ae45e7 mm: Convert i_mmap_lock to a mutex
Straightforward conversion of i_mmap_lock to a mutex.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:18 -07:00
Peter Zijlstra
1c39517696 mm: now that all old mmu_gather code is gone, remove the storage
Fold all the mmu_gather rework patches into one for submission

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:16 -07:00
KOSAKI Motohiro
37b23e0525 x86,mm: make pagefault killable
When an oom killing occurs, almost all processes are getting stuck at the
following two points.

	1) __alloc_pages_nodemask
	2) __lock_page_or_retry

1) is not very problematic because TIF_MEMDIE leads to an allocation
failure and getting out from page allocator.

2) is more problematic.  In an OOM situation, zones typically don't have
page cache at all and memory starvation might lead to greatly reduced IO
performance.  When a fork bomb occurs, TIF_MEMDIE tasks don't die quickly,
meaning that a fork bomb may create new process quickly rather than the
oom-killer killing it.  Then, the system may become livelocked.

This patch makes the pagefault interruptible by SIGKILL.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:08 -07:00
Richard Kennedy
af6a25f0e1 x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct
Reorder mm_context_t to remove alignment padding on 64 bit
builds shrinking its size from 64 to 56 bytes.

This allows mm_struct to shrink from 840 to 832 bytes, so using
one fewer cache lines, and getting more objects per slab when
using slub.

slabinfo mm_struct reports
before :-

    Sizes (bytes)     Slabs
    -----------------------------------
    Object :     840  Total  :       7
    SlabObj:     896  Full   :       1
    SlabSiz:   16384  Partial:       4
    Loss   :      56  CpuSlab:       2
    Align  :      64  Objects:      18

after :-

    Sizes (bytes)     Slabs
    ----------------------------------
    Object :     832  Total  :       7
    SlabObj:     832  Full   :       1
    SlabSiz:   16384  Partial:       4
    Loss   :       0  CpuSlab:       2
    Align  :      64  Objects:      19

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Cc: wilsons@start.ca
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1306244999.1999.5.camel@castor.rsk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-25 16:16:41 +02:00
Cliff Wickman
f073cc8f39 x86, UV: Clean up uv_tlb.c
SGI UV's uv_tlb.c driver has become rather hard to read, with overly large
functions, non-standard coding style and (way) too long variable, constant
and function names and non-obvious code flow sequences.

This patch improves the readability and maintainability of the driver
significantly, by doing the following strict code cleanups with no side
effects:

 - Split long functions into shorter logical functions.

 - Shortened some variable and structure member names.

 - Added special functions for reads and writes of MMR regs with
   very long names.

 - Added the 'tunables' table to shortened tunables_write().

 - Added the 'stat_description' table to shorten uv_ptc_proc_write().

 - Pass fewer 'stat' arguments where it can be derived from the 'bcp'
   argument.

 - Function definitions consistent on one line, and inline in few (short) cases.

 - Moved some small structures and an atomic inline function to the header file.

 - Moved some local variables to the blocks where they are used.

 - Updated the copyright date.

 - Shortened uv_write_global_mmr64() etc. using some aliasing; no
   line breaks. Renamed many uv_.. functions that are not exported.

 - Aligned structure fields.
    [ note that not all structures are aligned the same way though; I'd like
      to keep the extensive commenting in some of them. ]

 - Shortened some long structure names.

 - Standard pass/fail exit from init_per_cpu()

 - Vertical alignment for mass initializations.

 - More separation between blocks of code.

Tested on a 16-processor Altix UV.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: penberg@kernel.org
Link: http://lkml.kernel.org/r/E1QOw12-0004MN-Lp@eag09.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-25 14:20:14 +02:00
Jack Steiner
2a919596c1 x86, UV: Add support for SGI UV2 hub chip
This patch adds support for a new version of the SGI UV hub
chip. The hub chip is the node controller that connects multiple
blades into a larger coherent SSI.

For the most part, UV2 is compatible with UV1. The majority of
the changes are in the addresses of MMRs and in a few cases, the
contents of MMRs. These changes are the result in changes in the
system topology such as node configuration, processor types,
maximum nodes, physical address sizes, etc.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/20110511175028.GA18006@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-25 14:20:13 +02:00
Kees Cook
7ccafc5f75 x86, cpufeature: Update CPU feature RDRND to RDRAND
The Intel manual changed the name of the CPUID bit to match the
instruction name. We should follow suit for sanity's sake. (See Intel SDM
Volume 2, Table 3-20 "Feature Information Returned in the ECX Register".)

[ hpa: we can only do this at this time because there are currently no CPUs
  with this feature on the market, hence this is pre-hardware enabling.
  However, Cc:'ing stable so that stable can present a consistent ABI. ]

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Link: http://lkml.kernel.org/r/20110524232926.GA27728@outflux.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: <stable@kernel.org> v2.6.36-39
2011-05-24 16:37:27 -07:00
5129df03d0 Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: Unify input section names
  percpu: Avoid extra NOP in percpu_cmpxchg16b_double
  percpu: Cast away printk format warning
  percpu: Always align percpu output section to PAGE_SIZE

Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
2011-05-24 11:53:42 -07:00
Suresh Siddha
2f344d2e51 x86, ioapic: Restore ioapic entries during resume properly
In mask/restore_ioapic_entries() we should be restoring ioapic
entries when ioapics[apic].saved_registers is not NULL.

Fix the typo and address the resume hang regression reported by
Linus.

This was not found sooner because the systems where these
changes were tested on kept the IO-APIC entries intact over
resume.

Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Daniel J Blueman <daniel.blueman@gmail.com>
Link: http://lkml.kernel.org/r/1306259131.7171.7.camel@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-24 20:32:41 +02:00
Richard Weinberger
1b4ac2a935 x86: Get rid of asmregparm
As UML does no longer need asmregparm we can remove it.

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: namhyung@gmail.com
Cc: davem@davemloft.net
Cc: fweisbec@gmail.com
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/%3C1306189085-29896-1-git-send-email-richard%40nod.at%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-24 14:33:35 +02:00
Tejun Heo
6988f20fe0 Merge branch 'fixes-2.6.39' into for-2.6.40 2011-05-24 09:59:36 +02:00
5e152b4c9e Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
  PCI: Don't use dmi_name_in_vendors in quirk
  PCI: remove unused AER functions
  PCI/sysfs: move bus cpuaffinity to class dev_attrs
  PCI: add rescan to /sys/.../pci_bus/.../
  PCI: update bridge resources to get more big ranges when allocating space (again)
  KVM: Use pci_store/load_saved_state() around VM device usage
  PCI: Add interfaces to store and load the device saved state
  PCI: Track the size of each saved capability data area
  PCI/e1000e: Add and use pci_disable_link_state_locked()
  x86/PCI: derive pcibios_last_bus from ACPI MCFG
  PCI: add latency tolerance reporting enable/disable support
  PCI: add OBFF enable/disable support
  PCI: add ID-based ordering enable/disable support
  PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot.
  PCI: Set PCIE maxpayload for card during hotplug insertion
  PCI/ACPI: Report _OSC control mask returned on failure to get control
  x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs
  PCI: handle positive error codes
  PCI: check pci_vpd_pci22_wait() return
  PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio
  ...

Fix up trivial conflicts in include/linux/pci_ids.h: commit a6e5e2be44
moved the intel SMBUS ID definitons to the i2c-i801.c driver.
2011-05-23 15:39:34 -07:00
ea2b50ef4c Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, apic: Include module.h header in apic_flat_64.c
  x86, apic: Make apic drivers static
  x86, apic: Clean up bigsmp apic selection code
  x86, apic: Use .apicdrivers section for the apic drivers list
  x86, apic: Introduce .apicdrivers section to find the list of apic drivers
  x86, x2apic: Move the common bits to x2apic.h
  x86, x2apic: Minimize IPI register writes using cluster groups
  x86, x2apic: Track the x2apic cluster sibling map
  x86, x2apic: Remove duplicate code for IPI mask routines
  x86, apic: Use probe routines to simplify apic selection
  x86, ioapic: Consolidate mp_ioapic_routing[] into 'struct ioapic'
  x86, ioapic: Consolidate gsi routing info into 'struct ioapic'
  x86, ioapic: Consolidate mp_ioapics[] into 'struct ioapic'
  x86, ioapic: Consolidate ioapic_saved_data[] into 'struct ioapic'
  x86, ioapic: Add struct ioapic
  x86, ioapic: Remove duplicate code for saving/restoring RTEs
  x86, ioapic: Use ioapic_saved_data while enabling intr-remapping
  x86, ioapic: Allocate ioapic_saved_data early
  x86, ioapic: Fix potential resume deadlock
2011-05-23 12:54:15 -07:00
Randy Dunlap
b18bf0948e x86, apic: Include module.h header in apic_flat_64.c
apic_flat_64.c needs to include module.h because it uses
EXPORT_SYMBOL_GPL().

This fixes these warnings on some !SMP randconfigs:

  arch/x86/kernel/apic/apic_flat_64.c:31: warning: data definition has no type or storage class
  arch/x86/kernel/apic/apic_flat_64.c:31: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
  arch/x86/kernel/apic/apic_flat_64.c:31: warning: parameter names (without types) in function declaration

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20110523104300.dd532a99.randy.dunlap@oracle.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23 21:27:35 +02:00
19504828b4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix sample size bit operations
  perf tools: Fix ommitted mmap data update on remap
  watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
  watchdog: Disable watchdog when thresh is zero
  watchdog: Only disable/enable watchdog if neccessary
  watchdog: Fix rounding bug in get_sample_period()
  perf tools: Propagate event parse error handling
  perf tools: Robustify dynamic sample content fetch
  perf tools: Pre-check sample size before parsing
  perf tools: Move evlist sample helpers to evlist area
  perf tools: Remove junk code in mmap size handling
  perf tools: Check we are able to read the event size on mmap
2011-05-23 09:25:52 -07:00
57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
d7ef64a9f9 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Eliminate various 'set but not used' warnings
  x86, SMEP: Fix section mismatch warnings
  x86, amd: Use _safe() msr access for GartTlbWlk disable code
2011-05-23 08:51:55 -07:00
f4b10bc60a Merge branch 'kvm-updates/2.6.40' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.40' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (131 commits)
  KVM: MMU: Use ptep_user for cmpxchg_gpte()
  KVM: Fix kvm mmu_notifier initialization order
  KVM: Add documentation for KVM_CAP_NR_VCPUS
  KVM: make guest mode entry to be rcu quiescent state
  KVM: x86 emulator: Make jmp far emulation into a separate function
  KVM: x86 emulator: Rename emulate_grpX() to em_grpX()
  KVM: x86 emulator: Remove unused arg from emulate_pop()
  KVM: x86 emulator: Remove unused arg from writeback()
  KVM: x86 emulator: Remove unused arg from read_descriptor()
  KVM: x86 emulator: Remove unused arg from seg_override()
  KVM: Validate userspace_addr of memslot when registered
  KVM: MMU: Clean up gpte reading with copy_from_user()
  KVM: PPC: booke: add sregs support
  KVM: PPC: booke: save/restore VRSAVE (a.k.a. USPRG0)
  KVM: PPC: use ticks, not usecs, for exit timing
  KVM: PPC: fix exit accounting for SPRs, tlbwe, tlbsx
  KVM: PPC: e500: emulate SVR
  KVM: VMX: Cache vmcs segment fields
  KVM: x86 emulator: consolidate segment accessors
  KVM: VMX: Avoid reading %rip unnecessarily when handling exceptions
  ...
2011-05-23 08:42:08 -07:00
Mandeep Singh Baines
4eec42f392 watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
Before the conversion of the NMI watchdog to perf event, the
watchdog timeout was 5 seconds. Now it is 60 seconds. For my
particular application, netbooks, 5 seconds was a better
timeout. With a short timeout, we catch faults earlier and are
able to send back a panic. With a 60 second timeout, the user is
unlikely to wait and will instead hit the power button, causing
us to lose the panic info.

This change configures the NMI period to watchdog_thresh and
sets the softlockup_thresh to watchdog_thresh * 2. In addition,
watchdog_thresh was reduced to 10 seconds as suggested by Ingo
Molnar.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1306127423-3347-4-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20110517071642.GF22305@elte.hu>
2011-05-23 11:58:59 +02:00
82da65dab5 x86: setup_smep needs to be __cpuinit
The setup_smep function gets calle at resume time too, and is thus not a
pure __init function.  When marked as __init, it gets thrown out after
the kernel has initialized, and when the kernel is suspended and
resumed, the code will no longer be around, and we'll get a nice "kernel
tried to execute NX-protected page" oops because the page is no longer
marked executable.

Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-22 21:44:13 -07:00
Takuya Yoshikawa
c8cfbb555e KVM: MMU: Use ptep_user for cmpxchg_gpte()
The address of the gpte was already calculated and stored in ptep_user
before entering cmpxchg_gpte().

This patch makes cmpxchg_gpte() to use that to make it clear that we
are using the same address during walk_addr_generic().

Note that the unlikely annotations are used to show that the conditions
are something unusual rather than for performance.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-05-22 08:48:14 -04:00
Takuya Yoshikawa
d2f62766d5 KVM: x86 emulator: Make jmp far emulation into a separate function
We introduce em_jmp_far().

We also call this from em_grp45() to stop treating modrm_reg == 5 case
separately in the group 5 emulation.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:48:06 -04:00
Takuya Yoshikawa
51187683cb KVM: x86 emulator: Rename emulate_grpX() to em_grpX()
The prototypes are changed appropriately.

We also replaces "goto grp45;" with simple em_grp45() call.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:48:05 -04:00
Takuya Yoshikawa
3b9be3bf2e KVM: x86 emulator: Remove unused arg from emulate_pop()
The opt of emulate_grp1a() is also removed.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:48:03 -04:00
Takuya Yoshikawa
adddcecf92 KVM: x86 emulator: Remove unused arg from writeback()
Remove inline at this chance.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:48:00 -04:00
Takuya Yoshikawa
509cf9fe11 KVM: x86 emulator: Remove unused arg from read_descriptor()
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:47:59 -04:00
Takuya Yoshikawa
c1ed6dea81 KVM: x86 emulator: Remove unused arg from seg_override()
In addition, one comma at the end of a statement is replaced with a
semicolon.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:47:57 -04:00
Takuya Yoshikawa
fa3d315a4c KVM: Validate userspace_addr of memslot when registered
This way, we can avoid checking the user space address many times when
we read the guest memory.

Although we can do the same for write if we check which slots are
writable, we do not care write now: reading the guest memory happens
more often than writing.

[avi: change VERIFY_READ to VERIFY_WRITE]

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:47:56 -04:00
Takuya Yoshikawa
12cb814f3b KVM: MMU: Clean up gpte reading with copy_from_user()
When we optimized walk_addr_generic() by not using the generic guest
memory reader, we replaced copy_from_user() with get_user():

  commit e30d2a170506830d5eef5e9d7990c5aedf1b0a51
  KVM: MMU: Optimize guest page table walk

  commit 15e2ac9a43d4d7d08088e404fddf2533a8e7d52e
  KVM: MMU: Fix 64-bit paging breakage on x86_32

But as Andi pointed out later, copy_from_user() does the same as
get_user() as long as we give a constant size to it.

So we use copy_from_user() to clean up the code.

The only, noticeable, regression introduced by this is 64-bit gpte
reading on x86_32 hosts needed for PAE guests.

But this can be mitigated by implementing 8-byte get_user() for x86_32,
if needed.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:47:54 -04:00
Avi Kivity
2fb92db1ec KVM: VMX: Cache vmcs segment fields
Since the emulator now checks segment limits and access rights, it
generates a lot more accesses to the vmcs segment fields.  Undo some
of the performance hit by cacheing those fields in a read-only cache
(the entire cache is invalidated on any write, or on guest exit).

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:47:45 -04:00
Avi Kivity
1aa366163b KVM: x86 emulator: consolidate segment accessors
Instead of separate accessors for the segment selector and cached descriptor,
use one accessor for both.  This simplifies the code somewhat.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:47:39 -04:00
Avi Kivity
0a434bb2bf KVM: VMX: Avoid reading %rip unnecessarily when handling exceptions
Avoids a VMREAD.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:40:01 -04:00
Joe Perches
ae8cc05955 KVM: SVM: Make dump_vmcb static, reduce text
dump_vmcb isn't used outside this module, make it static.
Shrink text and object by ~1% by standardizing formats.

$ size arch/x86/kvm/svm.o*
   text	   data	    bss	    dec	    hex	filename
  52910	    580	  10072	  63562	   f84a	arch/x86/kvm/svm.o.new
  53563	    580	  10072	  64215	   fad7	arch/x86/kvm/svm.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:40:00 -04:00
Takuya Yoshikawa
8f74d8e168 KVM: MMU: Fix 64-bit paging breakage on x86_32
Fix regression introduced by
  commit e30d2a170506830d5eef5e9d7990c5aedf1b0a51
  KVM: MMU: Optimize guest page table walk

On x86_32, get_user() does not support 64-bit values and we fail to
build KVM at the point of 64-bit paging.

This patch fixes this by using get_user() twice for that condition.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:58 -04:00
BrillyWu@viatech.com.cn
4429d5dc11 KVM: Add CPUID support for VIA CPU
The CPUIDs for Centaur are added, and then the features of
PadLock hardware engine on VIA CPU, such as "ace", "ace_en"
and so on, can be passed into the kvm guest.

Signed-off-by: Brilly Wu <brillywu@viatech.com.cn>
Signed-off-by: Kary Jin <karyjin@viatech.com.cn>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:55 -04:00
Gleb Natapov
2aab2c5b2b KVM: call cache_all_regs() only once during instruction emulation
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:54 -04:00
Gleb Natapov
0004c7c257 KVM: Fix compound mmio
mmio_index should be taken into account when copying data from
userspace.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:52 -04:00
Gleb Natapov
4947e7cd0e KVM: emulator: Propagate fault in far jump emulation
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:51 -04:00