Commit graph

199058 commits

Author SHA1 Message Date
Linus Torvalds
d1e0fe252e Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: (23 commits)
  hwmon: (lm75) Add support for the Texas Instruments TMP105
  hwmon: (ltc4245) Read only one GPIO pin
  hwmon: (dme1737) Add SCH5127 support
  hwmon: (tmp102) Don't always stop chip at exit
  hwmon: (tmp102) Fix suspend and resume functions
  hwmon: (tmp102) Various fixes
  hwmon: Driver for TI TMP102 temperature sensor
  hwmon: EMC1403 thermal sensor support
  hwmon: (applesmc) Add temperature sensor labels to sysfs interface
  hwmon: (applesmc) Add generic support for MacBook Pro 7
  hwmon: (applesmc) Add generic support for MacBook Pro 6
  hwmon: (applesmc) Add support for MacBook Pro 5,3 and 5,4
  hwmon: (tmp401) Reorganize code to get rid of static forward declarations
  hwmon: (tmp401) Use constants for sysfs file permissions
  hwmon: (adm1031) Allow setting update rate
  hwmon: Add description of the update_rate sysfs attribute
  hwmon: (lm90) Use programmed update rate
  hwmon: (f71882fg) Acquire I/O regions while we're working with them
  hwmon: (f71882fg) Code cleanup
  hwmon: (f71882fg) Use strict_stro(l|ul) instead of simple_strto$1
  ...
2010-05-27 11:33:46 -07:00
Shubhrajyoti Datta
6d034059ee hwmon: (lm75) Add support for the Texas Instruments TMP105
Add support for the Texas Instruments TMP105 temperature sensor
device.

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti@ti.com>
Acked-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:59:03 +02:00
Ira W. Snyder
df16dd53c5 hwmon: (ltc4245) Read only one GPIO pin
Read only one of the GPIO pins as an analog voltage. The ADC can be
switched to a different GPIO pin at runtime, but this is not supported.

Previously, this driver would report the analog voltage of the currently
selected GPIO pin as all three GPIO voltages: in9_input, in10_input and
in11_input.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
2010-05-27 19:59:02 +02:00
Juerg Haefliger
ea694431f9 hwmon: (dme1737) Add SCH5127 support
Add support for the hardware monitoring capabilities of the SCH5127
chip to the dme1737 driver.

Signed-off-by: Juerg Haefliger <juergh@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Jeff Rickman <jrickman@myamigos.us>
2010-05-27 19:59:01 +02:00
Jean Delvare
38806bda6b hwmon: (tmp102) Don't always stop chip at exit
Only stop the chip at driver exit if it was stopped when driver was
loaded. Leave it running otherwise.

Also restore the device configuration if probe failed, to not leave
the system in a dangling state.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Steven King <sfking@fdwdc.com>
2010-05-27 19:58:59 +02:00
Jean Delvare
8d4dee98b1 hwmon: (tmp102) Fix suspend and resume functions
Suspend and resume functions shouldn't overwrite the configuration
register. They should only alter the one bit they have to touch.

Also don't assume that register reads and writes always succeed.
Handle errors properly, shall they happen.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Steven King <sfking@fdwdc.com>
2010-05-27 19:58:58 +02:00
Jean Delvare
cff37c9e82 hwmon: (tmp102) Various fixes
Fixes from my driver review:
http://lists.lm-sensors.org/pipermail/lm-sensors/2010-March/028051.html

Only the small changes are in there, more important changes will come
later separately as time permits.

* Drop the remnants of the now gone detect function
* The TMP102 has no known compatible chip
* Include the right header files
* Clarify why byte swapping of register values is needed
* Strip resolution info bit from temperature register value
* Set cache lifetime to 1/3 second
* Don't arbitrarily reject limit values; clamp as needed
* Make limit writing unconditional
* Don't check for transaction types the driver doesn't use
* Properly check for error when setting configuration
* Report error on failed probe
* Make the driver load automatically where needed
* Various other minor fixes

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Steven King <sfking@fdwdc.com>
2010-05-27 19:58:57 +02:00
Steven King
beb1b6bbf2 hwmon: Driver for TI TMP102 temperature sensor
Driver for the TI TMP102.

The TI TMP102 is similar to the LM75.  It differs from the LM75 by
having a 16-bit conf register and the temp registers have a minimum
resolution of 12 bits; the extended conf register can select 13-bit
resolution (which this driver does) and also change the update rate
(which this driver currently doesn't use).

[JD: Fix tmp102_exit tag, must be __exit, not __init.]

Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:56 +02:00
Kalhan Trisal
dac6831e67 hwmon: EMC1403 thermal sensor support
Provides support for the EMC1403 thermal sensor. Only reporting of values
is supported. The various Moorestown specific extras to do with thermal
alerts and the like are not in this version of the driver.

Considerably edited and tidied up by Alan Cox, plus fixes and detection
bits from Jean Delvare.

Signed-off-by: Kalhan Trisal <kalhan.trisal@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:56 +02:00
Alex Murray
fa5575cff9 hwmon: (applesmc) Add temperature sensor labels to sysfs interface
The Apple SMC uses a systematic labeling scheme for the hardware
temperature sensors. This scheme is currently hidden from
userland. Since the sensor set, and consequently the numbering,
differs between models, an extensive database of configurations is
required for an application such as fan control. This patch adds the
SMC labels to the hwmon sysfs interface, allowing applications to use
the sensors more intelligibly.

[rydberg@euromail.se: fixed error handling]
Signed-off-by: Alex Murray <murray.alex@gmail.com>
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:54 +02:00
Henrik Rydberg
405eaa1c1d hwmon: (applesmc) Add generic support for MacBook Pro 7
This patch adds generic support for the MacBook Pro 7 family
based on the 7,1 model.

Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:53 +02:00
Bernhard Froemel
872bad55e2 hwmon: (applesmc) Add generic support for MacBook Pro 6
This patch adds generic support for the MacBook Pro 6 family
based on the 6,2 model.

[rydberg@euromail.se: patch cleanup]
Signed-off-by: Bernhard Froemel <froemel@vmars.tuwien.ac.at>
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:52 +02:00
Henrik Rydberg
4e4a99d327 hwmon: (applesmc) Add support for MacBook Pro 5,3 and 5,4
The MacBookPro 5,3 model has two fans, whereas the 5,4 model has
only one. This patch adds explicit support for the 5,3 and 5,4 models.

Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:50 +02:00
Andre Prendel
ea63c2b91f hwmon: (tmp401) Reorganize code to get rid of static forward declarations
Signed-off-by: Andre Prendel <andre.prendel@gmx.de>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:49 +02:00
Andre Prendel
2b76d80adc hwmon: (tmp401) Use constants for sysfs file permissions
Replace octal representation of file permissions by the corresponding
constants.

Signed-off-by: Andre Prendel <andre.prendel@gmx.de>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:48 +02:00
Jean Delvare
87c33daadb hwmon: (adm1031) Allow setting update rate
Based on earlier work by Ira W. Snyder.

The adm1031 chip is capable of using a runtime configurable sampling rate,
using the fan filter register. Add support for reading and setting the
update rate via sysfs.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Ira W. Snyder <iws@ovro.caltech.edu>
2010-05-27 19:58:46 +02:00
Ira W. Snyder
d2b847d489 hwmon: Add description of the update_rate sysfs attribute
The update_rate attribute can be used by drivers to let userspace choose
the update rate of the chip, if it is configurable.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:45 +02:00
Ira W. Snyder
8c3c7a256f hwmon: (lm90) Use programmed update rate
The lm90 driver programs the sensor chip to update its readings at 2 Hz
(500 ms between readings). However, the driver only does reads from the
chip at intervals of 2 * HZ (2000 ms between readings). Change the driver
update rate to the programmed update rate.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:44 +02:00
Giel van Schijndel
729d273aa7 hwmon: (f71882fg) Acquire I/O regions while we're working with them
Acquire the I/O region for the Super I/O chip while we're working on it.

Signed-off-by: Giel van Schijndel <me@mortis.eu>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:43 +02:00
Giel van Schijndel
bd328acdc6 hwmon: (f71882fg) Code cleanup
Some code cleanup: properly use previously defined functions, rather
than duplicating their code.

Signed-off-by: Giel van Schijndel <me@mortis.eu>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:42 +02:00
Giel van Schijndel
e8a4eacaa9 hwmon: (f71882fg) Use strict_stro(l|ul) instead of simple_strto$1
Use the strict_strol and strict_stroul functions instead of simple_strol
and simple_stroul respectively in sysfs functions.

Signed-off-by: Giel van Schijndel <me@mortis.eu>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:41 +02:00
Giel van Schijndel
162bb59e49 hwmon: (f71882fg) Fixed braces coding style issues
Fixed several coding style issues.

Signed-off-by: Giel van Schijndel <me@mortis.eu>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:40 +02:00
Matthew Garrett
10f2ed31aa hwmon: (lm63) Add basic support for LM64
The LM64 appears to be an LM63 with added GPIO lines. Add support for the
hwmon functionality - GPIO can be added at some later stage if someone
has a need for them.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-27 19:58:38 +02:00
Jean Delvare
70dd6beac0 hwmon: (asus_atk0110) Don't load if ACPI resources aren't enforced
When the user passes the kernel parameter acpi_enforce_resources=lax,
the ACPI resources are no longer protected, so a native driver can
make use of them. In that case, we do not want the asus_atk0110 to be
loaded. Unfortunately, this driver loads automatically due to its
MODULE_DEVICE_TABLE, so the user ends up with two drivers loaded for
the same device - this is bad.

So I suggest that we prevent the asus_atk0110 driver from loading if
acpi_enforce_resources=lax.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Luca Tettamanti <kronos.it@gmail.com>
Cc: Len Brown <lenb@kernel.org>
2010-05-27 19:58:37 +02:00
Linus Torvalds
cc106eb35e Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] fill out file list in s390 MAINTAINERS entry
  [S390] Add support for LZO-compressed kernels.
  [S390] cmm: get rid of CMM_PROC config option
  [S390] cmm: remove superfluous EXPORT_SYMBOLs plus cleanups
  [S390] dasd: unit check handling during internal cio I/O
  [S390] cio: unit check handling during internal I/O
  [S390] ccwgroup: add locking around drvdata access
  [S390] cio: remove stsch
  [S390] spp: remove KVM_AWARE_CMF config option
  [S390] kprobes: forbid probing of stnsm/stosm/epsw
  [S390] spp: fix compilation for CONFIG_32BIT
  [S390] atomic: implement atomic64_dec_if_positive
  [S390] cmm: fix crash on module unload
2010-05-27 10:48:46 -07:00
Linus Torvalds
4e455c6782 Merge branch 'sfi-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6
* 'sfi-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6:
  SFI: add sysfs interface for SFI tables.
  SFI: add support for v0.81 spec
2010-05-27 10:47:41 -07:00
Linus Torvalds
105a048a4f Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (27 commits)
  Btrfs: add more error checking to btrfs_dirty_inode
  Btrfs: allow unaligned DIO
  Btrfs: drop verbose enospc printk
  Btrfs: Fix block generation verification race
  Btrfs: fix preallocation and nodatacow checks in O_DIRECT
  Btrfs: avoid ENOSPC errors in btrfs_dirty_inode
  Btrfs: move O_DIRECT space reservation to btrfs_direct_IO
  Btrfs: rework O_DIRECT enospc handling
  Btrfs: use async helpers for DIO write checksumming
  Btrfs: don't walk around with task->state != TASK_RUNNING
  Btrfs: do aio_write instead of write
  Btrfs: add basic DIO read/write support
  direct-io: do not merge logically non-contiguous requests
  direct-io: add a hook for the fs to provide its own submit_bio function
  fs: allow short direct-io reads to be completed via buffered IO
  Btrfs: Metadata ENOSPC handling for balance
  Btrfs: Pre-allocate space for data relocation
  Btrfs: Metadata ENOSPC handling for tree log
  Btrfs: Metadata reservation for orphan inodes
  Btrfs: Introduce global metadata reservation
  ...
2010-05-27 10:43:44 -07:00
Linus Torvalds
00b9b0af58 Avoid warning when CPU hotplug isn't enabled
Commit e9fb7631eb ("cpu-hotplug: introduce cpu_notify(),
__cpu_notify(), cpu_notify_nofail()") also introduced this annoying
warning:

  kernel/cpu.c:157: warning: 'cpu_notify_nofail' defined but not used

when CONFIG_HOTPLUG_CPU wasn't set.

So move that helper inside the #ifdef CONFIG_HOTPLUG_CPU region, and
simplify it while at it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 10:32:08 -07:00
Linus Torvalds
e2e2400bd4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
  [SCSI] fix race in scsi_target_reap
  [SCSI] aacraid: Eliminate use after free
  [SCSI] arcmsr: Support HW reset for EH and polling scheme for scsi device
  [SCSI] bfa: fix system crash when reading sysfs fc_host statistics
  [SCSI] iscsi_tcp: remove sk_sleep check
  [SCSI] ipr: improve interrupt service routine performance
  [SCSI] ipr: set the data list length in the request control block
  [SCSI] ipr: fix a register read to use the correct address for 64 bit adapters
  [SCSI] ipr: include the resource path in the IOA status area structure
  [SCSI] ipr: implement fixes for 64 bit adapter support
  [SCSI] be2iscsi: correct return value in mgmt_invalidate_icds()
2010-05-27 10:28:11 -07:00
Linus Torvalds
e4ce30f377 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (40 commits)
  ext4: Make fsync sync new parent directories in no-journal mode
  ext4: Drop whitespace at end of lines
  ext4: Fix compat EXT4_IOC_ADD_GROUP
  ext4: Conditionally define compat ioctl numbers
  tracing: Convert more ext4 events to DEFINE_EVENT
  ext4: Add new tracepoints to track mballoc's buddy bitmap loads
  ext4: Add a missing trace hook
  ext4: restart ext4_ext_remove_space() after transaction restart
  ext4: Clear the EXT4_EOFBLOCKS_FL flag only when warranted
  ext4: Avoid crashing on NULL ptr dereference on a filesystem error
  ext4: Use bitops to read/modify i_flags in struct ext4_inode_info
  ext4: Convert calls of ext4_error() to EXT4_ERROR_INODE()
  ext4: Convert callers of ext4_get_blocks() to use ext4_map_blocks()
  ext4: Add new abstraction ext4_map_blocks() underneath ext4_get_blocks()
  ext4: Use our own write_cache_pages()
  ext4: Show journal_checksum option
  ext4: Fix for ext4_mb_collect_stats()
  ext4: check for a good block group before loading buddy pages
  ext4: Prevent creation of files larger than RLIMIT_FSIZE using fallocate
  ext4: Remove extraneous newlines in ext4_msg() calls
  ...

Fixed up trivial conflict in fs/ext4/fsync.c
2010-05-27 10:26:37 -07:00
Linus Torvalds
b899ebeb05 Merge branch 'for-linus/2634-git-updates' of git://git.fluff.org/bjdooks/linux
* 'for-linus/2634-git-updates' of git://git.fluff.org/bjdooks/linux:
  ARM: S5PC100: Fixup cross tree merge problems
  ARM: S5P: Fix the platform external interrupt issues.
  ARM: s5pv210_defconfig: Update s5pv210_defconfig to v2.6.34-git
  ARM: s5pc110_defconfig: Update s5pc110_defconfig to v2.6.34-git
  ARM: s5pc100_defconfig: Update s5pc100_defconfig to v2.6.34-git
  ARM: s5p6442_defconfig: Update s5p6442_defconfig to v2.6.34-git
  ARM: s5p6440_defconfig: Update s5p6440_defconfig to v2.6.34-git
  ARM: s3c6400_defconfig: Update s3c6400_defconfig to v2.6.34-git
  ARM: s3c2410_defconfig: Update s3c2410_defconfig to v2.6.34-git
2010-05-27 10:23:57 -07:00
Linus Torvalds
55ddf14b04 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  ieee1394: schedule for removal
  firewire: core: use separate timeout for each transaction
  firewire: core: Fix tlabel exhaustion problem
  firewire: core: make transaction label allocation more robust
  firewire: core: clean up config ROM related defined constants
  ieee1394: mark char device files as not seekable
  firewire: cdev: mark char device files as not seekable
  firewire: ohci: cleanups and fix for nonstandard build without debug facility
  firewire: ohci: wait for PHY register accesses to complete
  firewire: ohci: fix up configuration of TI chips
  firewire: ohci: enable 1394a enhancements
  firewire: ohci: do not clear PHY interrupt status inadvertently
  firewire: ohci: add a function for reading PHY registers

Trivial conflicts in Documentation/feature-removal-schedule.txt
2010-05-27 10:22:06 -07:00
Linus Torvalds
a9a0aff5b5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: (24 commits)
  m68k: amiga - RTC platform device conversion
  m68k: amiga - Parallel port platform device conversion
  m68k: amiga - Serial port platform device conversion
  m68k: amiga - Mouse platform device conversion
  m68k: amiga - Keyboard platform device conversion
  m68k: amiga - Amiga Gayle IDE platform device conversion
  m68k: amiga - A4000T SCSI platform device conversion
  m68k/scsi: a3000 - Do not use legacy Scsi_Host.base
  m68k: amiga - A3000 SCSI platform device conversion
  m68k/scsi: gvp11 - Do not use legacy Scsi_Host.base
  m68k: amiga - GVP Series II SCSI zorro_driver conversion
  m68k/scsi: a2091 - Do not use legacy Scsi_Host.base
  m68k: amiga - A2091/A590 SCSI zorro_driver conversion
  m68k/scsi: mvme147 - Kill obsolete HOSTS_C logic
  m68k/scsi: a3000 - Kill a3000_scsiregs typedef
  m68k/scsi: gvp11 - Kill gvp11_scsiregs typedef
  m68k/scsi: a2091 - Kill a2091_scsiregs typedef
  m68k/scsi: gvp11 - Extract check_wd33c93()
  m68k/scsi: a3000 - Kill static global a3000_host
  m68k/scsi: mvme147 - Kill static global mvme147_host
  ...
2010-05-27 10:19:19 -07:00
Linus Torvalds
ade61088bc Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix another nfs_wb_page() deadlock
  NFS: Ensure that we mark the inode as dirty if we exit early from commit
  NFS: Fix a lock imbalance typo in nfs_access_cache_shrinker
  sunrpc: fix leak on error on socket xprt setup
2010-05-27 10:18:44 -07:00
Feng Tang
dce80a5626 SFI: add sysfs interface for SFI tables.
Analogous to ACPI's /sys/firmware/acpi/tables/...

create /sys/firmware/sfi/tables/

The tables are primariy for the kernel,
but sometimes it is useful for user-space to be
able to read them.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-27 12:46:20 -04:00
Linus Torvalds
7eb1053fd0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: usbtouchscreen - support bigger iNexio touchscreens
  Input: ads7846 - return error on regulator_get() failure
  Input: twl4030-vibra - correct the power down sequence
  Input: enable onkey driver of max8925
  Input: use ABS_CNT rather than (ABS_MAX + 1)
2010-05-27 09:19:55 -07:00
Vasily Khoruzhick
03a3f695cb Input: s3c2410_ts - restore accidentially dropped s3c24xx ids
Without s3c24xx ids driver doesn't attach on s3c2410 and s3c244x

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:19:31 -07:00
Lee Schermerhorn
b9498bfe86 numa: update Documentation/vm/numa, add memoryless node info
Kamezawa Hiroyuki requested documentation for the numa_mem_id() and slab
related changes.  He suggested Documentation/vm/numa for this
documentation.  Looking at this file, it seems to me to be hopelessly out
of date relative to current Linux NUMA support.  At the risk of going down
a rathole, I have made an attempt to rewrite the doc at a slightly higher
level [I think] and provide pointers to other in-tree documents and
out-of-tree man pages that cover the details.

Let the games begin.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Lee Schermerhorn
3dd6b5fb43 numa: in-kernel profiling: use cpu_to_mem() for per cpu allocations
In kernel profiling requires that we be able to allocate "local" memory
for each cpu.  Use "cpu_to_mem()" instead of "cpu_to_node()" to support
memoryless nodes.

Depends on the "numa_mem_id()" patch.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Lee Schermerhorn
7d6e6d09de numa: slab: use numa_mem_id() for slab local memory node
Example usage of generic "numa_mem_id()":

The mainline slab code, since ~ 2.6.19, does not handle memoryless nodes
well.  Specifically, the "fast path"--____cache_alloc()--will never
succeed as slab doesn't cache offnode object on the per cpu queues, and
for memoryless nodes, all memory will be "off node" relative to
numa_node_id().  This adds significant overhead to all kmem cache
allocations, incurring a significant regression relative to earlier
kernels [from before slab.c was reorganized].

This patch uses the generic topology function "numa_mem_id()" to return
the "effective local memory node" for the calling context.  This is the
first node in the local node's generic fallback zonelist-- the same node
that "local" mempolicy-based allocations would use.  This lets slab cache
these "local" allocations and avoid fallback/refill on every allocation.

N.B.: Slab will need to handle node and memory hotplug events that could
change the value returned by numa_mem_id() for any given node if recent
changes to address memory hotplug don't already address this.  E.g., flush
all per cpu slab queues before rebuilding the zonelists while the
"machine" is held in the stopped state.

Performance impact on "hackbench 400 process 200"

2.6.34-rc3-mmotm-100405-1609		no-patch	this-patch
ia64 no memoryless nodes [avg of 10]:     11.713       11.637  ~0.65 diff
ia64 cpus all on memless nodes  [10]:    228.259       26.484  ~8.6x speedup

The slowdown of the patched kernel from ~12 sec to ~28 seconds when
configured with memoryless nodes is the result of all cpus allocating from
a single node's mm pagepool.  The cache lines of the single node are
distributed/interleaved over the memory of the real physical nodes, but
the zone lock, list heads, ...  of the single node with memory still each
live in a single cache line that is accessed from all processors.

x86_64 [8x6 AMD] [avg of 40]:		2.883	   2.845

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Lee Schermerhorn
fd1197f113 numa: ia64: support numa_mem_id() for memoryless nodes
Enable 'HAVE_MEMORYLESS_NODES' by default when NUMA configured on ia64.
Initialize percpu 'numa_mem' variable when starting secondary cpus.
Generic initialization will handle the boot cpu.

Nothing uses 'numa_mem_id()' yet.  Subsequent patch with modify slab to
use this.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Lee Schermerhorn
7aac789885 numa: introduce numa_mem_id()- effective local memory node id
Introduce numa_mem_id(), based on generic percpu variable infrastructure
to track "nearest node with memory" for archs that support memoryless
nodes.

Define API in <linux/topology.h> when CONFIG_HAVE_MEMORYLESS_NODES
defined, else stubs.  Architectures will define HAVE_MEMORYLESS_NODES
if/when they support them.

Archs can override definitions of:

numa_mem_id() - returns node number of "local memory" node
set_numa_mem() - initialize [this cpus'] per cpu variable 'numa_mem'
cpu_to_mem()  - return numa_mem for specified cpu; may be used as lvalue

Generic initialization of 'numa_mem' occurs in __build_all_zonelists().
This will initialize the boot cpu at boot time, and all cpus on change of
numa_zonelist_order, or when node or memory hot-plug requires zonelist
rebuild.  Archs that support memoryless nodes will need to initialize
'numa_mem' for secondary cpus as they're brought on-line.

[akpm@linux-foundation.org: fix build]
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Lee Schermerhorn
3bccd99627 numa: ia64: use generic percpu var numa_node_id() implementation
ia64:  Use generic percpu implementation of numa_node_id()
   + intialize per cpu 'numa_node'
   + remove ia64 cpu_to_node() macro;  use generic
   + define CONFIG_USE_PERCPU_NUMA_NODE_ID when NUMA configured

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Lee Schermerhorn
e534c7c5f8 numa: x86_64: use generic percpu var numa_node_id() implementation
x86 arch specific changes to use generic numa_node_id() based on generic
percpu variable infrastructure.  Back out x86's custom version of
numa_node_id()

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Lee Schermerhorn
7281201922 numa: add generic percpu var numa_node_id() implementation
Rework the generic version of the numa_node_id() function to use the new
generic percpu variable infrastructure.

Guard the new implementation with a new config option:

        CONFIG_USE_PERCPU_NUMA_NODE_ID.

Archs which support this new implemention will default this option to 'y'
when NUMA is configured.  This config option could be removed if/when all
archs switch over to the generic percpu implementation of numa_node_id().
Arch support involves:

  1) converting any existing per cpu variable implementations to use
     this implementation.  x86_64 is an instance of such an arch.
  2) archs that don't use a per cpu variable for numa_node_id() will
     need to initialize the new per cpu variable "numa_node" as cpus
     are brought on-line.  ia64 is an example.
  3) Defining USE_PERCPU_NUMA_NODE_ID in arch dependent Kconfig--e.g.,
     when NUMA is configured.  This is required because I have
     retained the old implementation by default to allow archs to
     be modified incrementally, as desired.

Subsequent patches will convert x86_64 and ia64 to use this implemenation.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Jan Blunck
866707fc27 Documentation/filesystems/Locking: update documentation on llseek() wrt BKL
The inode's i_size is not protected by the big kernel lock.  Therefore it
does not make sense to recommend taking the BKL in filesystems llseek
operations.  Instead it should use the inode's mutex or use just use
i_size_read() instead.  Add a note that this is not protecting
file->f_pos.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Kacur <jkacur@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:56 -07:00
jan Blunck
ca572727db fs/: do not fallback to default_llseek() when readdir() uses BKL
Do not use the fallback default_llseek() if the readdir operation of the
filesystem still uses the big kernel lock.

Since llseek() modifies
file->f_pos of the directory directly it may need locking to not confuse
readdir which usually uses file->f_pos directly as well

Since the special characteristics of the BKL (unlocked on schedule) are
not necessary in this case, the inode mutex can be used for locking as
provided by generic_file_llseek().  This is only possible since all
filesystems, except reiserfs, either use a directory as a flat file or
with disk address offsets.  Reiserfs on the other hand uses a 32bit hash
off the filename as the offset so generic_file_llseek() can get used as
well since the hash is always smaller than sb->s_maxbytes (= (512 << 32) -
blocksize).

Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Anders Larsen <al@alarsen.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:56 -07:00
Jan Blunck
b4d878e23c st: use noop_llseek() instead of default_llseek()
st_open() suggests that llseek() doesn't work: "We really want to do
nonseekable_open(inode, filp); here, but some versions of tar incorrectly
call lseek on tapes and bail out if that fails.  So we disallow pread()
and pwrite(), but permit lseeks."

Instead of using the fallback default_llseek() the driver should use
noop_llseek() which leaves the file->f_pos untouched but succeeds.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Cc: Willem Riede <osst@riede.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:56 -07:00
Jan Blunck
889e5fbbc2 osst: use noop_llseek() instead of default_llseek()
__os_scsi_tape_open() suggests that llseek() doesn't work: "We really want
to do nonseekable_open(inode, filp); here, but some versions of tar
incorrectly call lseek on tapes and bail out if that fails.  So we
disallow pread() and pwrite(), but permit lseeks."

Instead of using the fallback default_llseek() the driver should use
noop_llseek() which leaves the file->f_pos untouched but succeeds.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Willem Riede <osst@riede.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:56 -07:00
jan Blunck
ae6afc3f5c vfs: introduce noop_llseek()
This is an implementation of ->llseek useable for the rare special case
when userspace expects the seek to succeed but the (device) file is
actually not able to perform the seek.  In this case you use noop_llseek()
instead of falling back to the default implementation of ->llseek.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:56 -07:00