linux-hardened

Author	SHA1	Message	Date
Joel Becker	df7f99670a	configfs: Don't try to d_delete() negative dentries. When configfs is faking mkdir() on its subsystem or default group objects, it starts by adding a negative dentry. It then tries to instantiate the group. If that should fail, it must clean up after itself. I was using d_delete() here, but configfs_attach_group() promises to return an empty dentry on error. d_delete() explodes with the entry dentry. Let's try d_drop() instead. The unhashing is what we want for our dentry. Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-18 03:30:58 -07:00
Sunil Mushran	df016c665b	ocfs2/dlm: Target node death during resource migration leads to thread spin During resource migration, if the target node were to die, the thread doing the migration spins until the target node is not removed from the domain map. This patch slows the spin by making the thread wait for the recovery to kick in. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-13 11:27:30 -07:00
Sunil Mushran	10b3dd7611	ocfs2: Skip mount recovery for hard-ro mounts Patch skips mount recovery for hard-ro mounts which otherwise leads to an oops. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-13 11:27:14 -07:00
Sunil Mushran	33c12a5436	ocfs2/cluster: Heartbeat mismatch message improved If o2hb finds unexpected values in the heartbeat slot, it prints a message "ERROR: Device "dm-6": another node is heartbeating in our slot!" This message could be misleading. This patch adds two more messages to help users better diagnose the problem. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-13 11:27:02 -07:00
Sunil Mushran	76d9fc2954	ocfs2/cluster: Increase the live threshold for global heartbeat We have seen isolated cases (very few, I might add) of o2hb not detecting all live nodes on startup. One plausible reasoning for it is that other node had a hb io delay at the same time. The live threshold set at 2 (as low as it can be) could be increased to ameliorate the situation. But increasing the threshold directly affects mount time. Currently it takes around 5 secs to mount a volume in o2cb cluster with local heartbeat. Increasing the threshold will make mounts even slower. As the issue itself is rare, we have left things as they are for the local heartbeat mode. However we can improve the situation for global heartbeat mode as in that mode, we start the heartbeat much before the mount. This patch doubles the live threshold for the start of the first region in global heartbeat mode. Addresses internal Oracle bug#10635585. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-13 11:26:48 -07:00
Sunil Mushran	4da6dc2936	ocfs2/dlm: Use negotiated o2dlm protocol version Patch fixes a bug in the o2dlm protocol negotiation in that it is using the builtin version rather than the negotiated version during the domain join. This causes join errors when a node having kernel >= 2.6.37 joins a cluster with nodes having kernels < 2.6.37. This only affects the o2cb cluster stack. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Reported-by: Jacek Stepniewski <Jacek.Stepniewski@agora.pl> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-13 11:26:37 -07:00
Tristan Ye	9a790ba1ec	ocfs2: skip existing hole when removing the last extent_rec in punching-hole codes. In the case of removing a partial extent record which covers a hole, current punching-hole logic will try to remove more than the length of whole extent record, which leads to the failure of following assert(fs/ocfs2/alloc.c): 5507 BUG_ON(cpos < le32_to_cpu(rec->e_cpos) \|\| trunc_range > rec_range); This patch tries to skip existing hole at the last attempt of removing a partial extent record, what's more, it also adds some necessary comments for better understanding of punching-hole codes. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-13 11:26:20 -07:00
Marcus Meissner	5d44670fac	ocfs2: Initialize data_ac (might be used uninitialized) CLANG found that there is a path that has data_ac uninitialized, this place 2917 /* This gets us the dx_root */ 2918 ret = ocfs2_reserve_new_metadata_blocks(osb, 1, &meta_ac); 2919 if (ret) { 3 Taking true branch 2920 mlog_errno(ret); 2921 goto out; 4 Control jumps to line 3168 2922 } Goes to the out: label without data_ac being initialized. Ciao, Marcus Signed-Off-By: Marcus Meissner <meissner@suse.de> Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>	2011-05-13 11:26:15 -07:00
Linus Torvalds	446cc6345d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p/protocol.c: Fix a memory leak	2011-05-12 18:00:09 -07:00
Linus Torvalds	9bbd055808	Merge branch 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux * 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux: i2c: pnx: Fix crash due to wrong init of timer->data	2011-05-12 16:55:33 -07:00
Wolfram Sang	9ddabb055d	i2c: pnx: Fix crash due to wrong init of timer->data alg_data is already a pointer which must be passed directly. Reported-by: Dieter Ripp <ripp@systecnet.com> Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ben Dooks <ben-i2c@fluff.org> Signed-off-by: Ben Dooks <ben-linux@fluff.org>	2011-05-13 00:10:36 +01:00
Ingo Molnar	411f05f123	vsprintf: Turn kptr_restrict off by default kptr_restrict has been triggering bugs in apps such as perf, and it also makes the system less useful by default, so turn it off by default. This is how we generally handle security features that remove functionality, such as firewall code or SELinux - they have to be configured and activated from user-space. Distributions can turn kptr_restrict on again via this line in /etc/sysctrl.conf: kernel.kptr_restrict = 1 ( Also mark the variable __read_mostly while at it, as it's typically modified only once per bootup, or not at all. ) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-12 15:18:16 -07:00
Pedro Scarapicchia Junior	1b0bcbcf62	net/9p/protocol.c: Fix a memory leak When p9pdu_readf() is called with "s" attribute, it allocates a pointer that will store a string. In p9dirent_read(), this pointer is not being released, leading to out of memory errors. This patch releases this pointer after string is copyed to dirent->d_name. Signed-off-by: Pedro Scarapicchia Junior <pedro.scarapiccha@br.flextronics.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2011-05-12 17:05:37 -05:00
Linus Torvalds	ca1376d108	Merge branch 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ASoC: WM8903: Fix Digital Capture Volume range ASoC: UDA134x: Remove POWER_OFF_ON_STANDBY define. ASoC: SSM2602: Fix reg_cache_size ASoC: SSM2602: Fix 'Mic Boost2' control ASoC: SSM2602: Properly annotate i2c probe and remove functions ASoC: sst_platform: add hw_free callback to fix resource leak ASoC: Don't crash on PM operations ASoC: JZ4740: Fix i2s shutdown	2011-05-12 12:41:30 -07:00
Linus Torvalds	0c5e1577f1	Merge branch 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: x86/mm: Fix section mismatch derived from native_pagetable_reserve() x86,xen: introduce x86_init.mapping.pagetable_reserve Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""	2011-05-12 12:21:51 -07:00
Linus Torvalds	982b2035d9	Revert "drm/i915: Only enable the plane after setting the fb base (pre-ILK)" This reverts commit `49183b2818`. Quoth Franz Melchior: "This patch introduces a bug on my infamous "Acer Travelmate 5735Z-452G32Mnss": when KMS takes over, the frame buffer contents get completely garbled up on screen, with colored stripes and unreadable text (photo on request). Only when X11 is started, the screen gets restored again. Closing and re-opening the lid partly cures the mess, too: it makes the font readable, though horizontally stretched." Acked-by: Keith Packard <keithp@keithp.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-12 12:19:43 -07:00
Linus Torvalds	df43938bc5	Merge branch 'fbmem' * fbmem: fbmem: make read/write/ioctl use the frame buffer at open time fbcon: add lifetime refcount to opened frame buffers	2011-05-12 10:42:36 -07:00
Linus Torvalds	49f019c188	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ads7846 - remove unused variable from struct ads7845_ser_req Input: ads7846 - make transfer buffers DMA safe	2011-05-12 10:41:31 -07:00
Sedat Dilek	53f8023feb	x86/mm: Fix section mismatch derived from native_pagetable_reserve() With CONFIG_DEBUG_SECTION_MISMATCH=y I see these warnings in next-20110415: LD vmlinux.o MODPOST vmlinux.o WARNING: vmlinux.o(.text+0x1ba48): Section mismatch in reference from the function native_pagetable_reserve() to the function .init.text:memblock_x86_reserve_range() The function native_pagetable_reserve() references the function __init memblock_x86_reserve_range(). This is often because native_pagetable_reserve lacks a __init annotation or the annotation of memblock_x86_reserve_range is wrong. This patch fixes the issue. Thanks to pipacs from PaX project for help on IRC. Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 13:05:05 -04:00
Stefano Stabellini	279b706bf8	x86,xen: introduce x86_init.mapping.pagetable_reserve Introduce a new x86_init hook called pagetable_reserve that at the end of init_memory_mapping is used to reserve a range of memory addresses for the kernel pagetable pages we used and free the other ones. On native it just calls memblock_x86_reserve_range while on xen it also takes care of setting the spare memory previously allocated for kernel pagetable pages from RO to RW, so that it can be used for other purposes. A detailed explanation of the reason why this hook is needed follows. As a consequence of the commit: commit `4b239f458c` Author: Yinghai Lu <yinghai@kernel.org> Date: Fri Dec 17 16:58:28 2010 -0800 x86-64, mm: Put early page table high at some point init_memory_mapping is going to reach the pagetable pages area and map those pages too (mapping them as normal memory that falls in the range of addresses passed to init_memory_mapping as argument). Some of those pages are already pagetable pages (they are in the range pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and everything is fine. Some of these pages are not pagetable pages yet (they fall in the range pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they are going to be mapped RW. When these pages become pagetable pages and are hooked into the pagetable, xen will find that the guest has already a RW mapping of them somewhere and fail the operation. The reason Xen requires pagetables to be RO is that the hypervisor needs to verify that the pagetables are valid before using them. The validation operations are called "pinning" (more details in arch/x86/xen/mmu.c). In order to fix the issue we mark all the pages in the entire range pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation is completed only the range pgt_buf_start-pgt_buf_end is reserved by init_memory_mapping. Hence the kernel is going to crash as soon as one of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those ranges are RO). For this reason we need a hook to reserve the kernel pagetable pages we used and free the other ones so that they can be reused for other purposes. On native it just means calling memblock_x86_reserve_range, on Xen it also means marking RW the pagetable pages that we allocated before but that haven't been used before. Another way to fix this is without using the hook is by adding a 'if (xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen counterpart, but that is just nasty. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 13:05:04 -04:00
Konrad Rzeszutek Wilk	92bdaef7b2	Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high"" This reverts commit `a38647837a`. It does not work with certain AMD machines. last_pfn = 0x100000 max_arch_pfn = 0x400000000 initial memory mapped : 0 - 02c3a000 Base memory trampoline at [ffff88000009b000] 9b000 size 20480 init_memory_mapping: 0000000000000000-0000000100000000 0000000000 - 0100000000 page 4k kernel direct mapping tables up to 100000000 @ ff7fb000-100000000 init_memory_mapping: 0000000100000000-00000001e0800000 0100000000 - 01e0800000 page 4k kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000 xen: setting RW the range fffdc000 - 100000000 RAMDISK: 0203b000 - 02c3a000 No NUMA configuration found Faking a node at 0000000000000000-00000001e0800000 NUMA: Using 63 for the hash shift. Initmem setup node 0 0000000000000000-00000001e0800000 NODE_DATA [00000001dfffb000 - 00000001dfffffff] BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea PGD 0 Oops: 0003 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1 RIP: e030:[<ffffffff81cf6a75>] [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP: e02b:ffffffff81c01e38 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040 RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000 RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400 FS: 0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020) Stack: 0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000 ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45 Call Trace: [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181 [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6f67>] numa_init+0x6c/0x70 [<ffffffff81cf7057>] initmem_init+0x39/0x3b [<ffffffff81ce5865>] setup_arch+0x64e/0x769 [<ffffffff815e43c1>] ? printk+0x51/0x53 [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3 [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136 [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89 RIP [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP <ffffffff81c01e38> CR2: 0000000000000000 ---[ end trace a7919e7f17c0a725 ]--- Kernel panic - not syncing: Attempted to kill the idle task! Pid: 0, comm: swapper Tainted: G D 2.6.39-0-virtual #6~smb1 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 13:04:29 -04:00
Linus Torvalds	6eaed0a438	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix oops in revalidate when called with NULL nameidata	2011-05-12 08:06:53 -07:00
Linus Torvalds	8043f4eb85	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic sparc32: fix sparcstation 5 boot sparc32: fix section mismatch warnings in apc, pmc and time_32	2011-05-12 07:53:34 -07:00
Linus Torvalds	75c0b3b466	Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm * 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM ARM: zImage: the page table memory must be considered before relocation ARM: zImage: make sure not to relocate on top of the relocation code ARM: zImage: Fix bad SP address after relocating kernel ARM: zImage: make sure the stack is 64-bit aligned ARM: RiscPC: acornfb: fix section mismatches ARM: RiscPC: etherh: fix section mismatches	2011-05-12 07:53:06 -07:00
Linus Torvalds	c47747fde9	fbmem: make read/write/ioctl use the frame buffer at open time read/write/ioctl on a fbcon file descriptor has traditionally used the fbcon not when it was opened, but as it was at the time of the call. That makes no sense, but the lack of sense is much more obvious now that we properly ref-count the usage - it means that the ref-counting doesn't actually protect operations we do on the frame buffer. This changes it to look at the fb_info that we got at open time, but in order to avoid using a frame buffer long after it has been unregistered, we do verify that it is still current, and return -ENODEV if not. Acked-by: Tim Gardner <tim.gardner@canonical.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Cc: Bruno Prémont <bonbons@linux-vserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Andy Whitcroft <andy.whitcroft@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-12 07:46:43 -07:00
Linus Torvalds	698b368275	fbcon: add lifetime refcount to opened frame buffers This just adds the refcount and the new registration lock logic. It does not (for example) actually change the read/write/ioctl routines to actually use the frame buffer that was opened: those function still end up alway susing whatever the current frame buffer is at the time of the call. Without this, if something holds the frame buffer open over a framebuffer switch, the close() operation after the switch will access a fb_info that has been free'd by the unregistering of the old frame buffer. (The read/write/ioctl operations will normally not cause problems, because they will - illogically - pick up the new fbcon instead. But a switch that happens just as one of those is going on might see problems too, the window is just much smaller: one individual op rather than the whole open-close sequence.) This use-after-free is apparently fairly easily triggered by the Ubuntu 11.04 boot sequence. Acked-by: Tim Gardner <tim.gardner@canonical.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Cc: Bruno Prémont <bonbons@linux-vserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Andy Whitcroft <andy.whitcroft@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-12 07:37:51 -07:00
Catalin Marinas	a904f5f9eb	ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses Since mandatory barriers may be used (explicitly or implicitly via readl etc.) to ensure the ordering between Device and Normal memory accesses, a DMB is not enough. This patch converts it to a DSB. Cc: Colin Cross <ccross@android.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2011-05-12 10:52:00 +01:00
Arnd Bergmann	2af68df02f	ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls GDB's interrupt.exp test cases currenly fail on ARM. The problem is how do_signal handled restarting interrupted system calls: The entry.S assembler code determines that we come from a system call; and that information is passed as "syscall" parameter to do_signal. That routine then calls get_signal_to_deliver [] and if a signal is to be delivered, calls into handle_signal. If a system call is to be restarted either after the signal handler returns, or if no handler is to be called in the first place, the PC is updated after the get_signal_to_deliver call, either in handle_signal (if we have a handler) or at the end of do_signal (otherwise). Now the problem is that during [], the call to get_signal_to_deliver, a ptrace intercept may happen. During this intercept, the debugger may change registers, including the PC. This is done by GDB if it wants to execute an "inferior call", i.e. the execution of some code in the debugged program triggered by GDB. To this purpose, GDB will save all registers, allocate a stack frame, set up PC and arguments as appropriate for the call, and point the link register to a dummy breakpoint instruction. Once the process is restarted, it will execute the call and then trap back to the debugger, at which point GDB will restore all registers and continue original execution. This generally works fine. However, now consider what happens when GDB attempts to do exactly that while the process was interrupted during execution of a to-be- restarted system call: do_signal is called with the syscall flag set; it calls get_signal_to_deliver, at which point the debugger takes over and changes the PC to point to a completely different place. Now get_signal_to_deliver returns without a signal to deliver; but now do_signal decides it should be restarting a system call, and decrements the PC by 2 or 4 -- so it now points to 2 or 4 bytes before the function GDB wants to call -- which leads to a subsequent crash. To fix this problem, two things need to be supported: - do_signal must be able to recognize that get_signal_to_deliver changed the PC to a different location, and skip the restart-syscall sequence - once the debugger has restored all registers at the end of the inferior call sequence, do_signal must recognize that now it needs to restart the pending system call, even though it was now entered from a breakpoint instead of an actual svc instruction This set of issues is solved on other platforms, usually by one of two mechanisms: - The status information "do_signal is handling a system call that may need restarting" is itself carried in some register that can be accessed via ptrace. This is e.g. on Intel the "orig_eax" register; on Sparc the kernel defines a magic extra bit in the flags register for this purpose. This allows GDB to manage that state: reset it when doing an inferior call, and restore it after the call is finished. - On s390, do_signal transparently handles this problem without requiring GDB interaction, by performing system call restarting in the following way: first, adjust the PC as necessary for restarting the call. Then, call get_signal_to_deliver; and finally just continue execution at the PC. This way, if GDB does not change the PC, everything is as before. If GDB does change the PC, execution will simply continue there -- and once GDB restores the PC it saved at that point, it will automatically point to the restarted system call. (There is the minor twist how to handle system calls that do not need restarting -- do_signal will undo the PC change in this case, after get_signal_to_deliver has returned, and only if ptrace did not change the PC during that call.) Because there does not appear to be any obvious register to carry the syscall-restart information on ARM, we'd either have to introduce a new artificial ptrace register just for that purpose, or else handle the issue transparently like on s390. The patch below implements the second option; using this patch makes the interrupt.exp test cases pass on ARM, with no regression in the GDB test suite otherwise. Cc: patches@linaro.org Signed-off-by: Ulrich Weigand <ulrich.weigand@linaro.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2011-05-12 10:52:00 +01:00
Will Deacon	9af386c8dc	ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM The SPARSEMEM code allocates memmap entries only for sections which are present (i.e. those which contain some valid memory). The membank checks in free_unused_memmap do not take this into account and can incorrectly attempt to free memory which is not allocated, resulting in a BUG() in the bootmem code. However, if memory is configured as follows: \|<----section---->\|<----hole---->\|<----section---->\| +--------+--------+--------------+--------+--------+ \| bank 0 \| unused \| \| bank 1 \| unused \| +--------+--------+--------------+--------+--------+ where a bank only occupies part of a section, the memmap allocated for the remainder of the section can be freed. This patch modifies the checks in free_unused_memmap so that only valid memmap entries are considered for removal. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2011-05-12 10:52:00 +01:00
Tkhai Kirill	b1054282d7	sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic When we are in the label cc_dword_align, registers %o0 and %o1 have the same last 2 bits, but it's not guaranteed one of them is zero. So we can get unaligned memory access in label ccte. Example of parameters which lead to this: %o0=0x7ff183e9, %o1=0x8e709e7d, %g1=3 With the parameters I had a memory corruption, when the additional 5 bytes were rewritten. This patch corrects the error. One comment to the patch. We don't care about the third bit in %o1, because cc_end_cruft stores word or less. Signed-off-by: Tkhai Kirill <tkhai@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-11 21:35:04 -07:00
Linus Torvalds	3568bd9720	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: do not use i_wrbuffer_ref as refcount for Fb cap ceph: fix list_add in ceph_put_snap_realm ceph: print debug message before put mds session	2011-05-11 19:13:34 -07:00
Linus Torvalds	fad632092a	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/nouveau: fix build regression on alpha due to Xen changes. drm/radeon/kms: fix cayman acceleration drm/radeon: fix cayman struct accessors.	2011-05-11 19:13:16 -07:00
Linus Torvalds	b5121290ca	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: Fix for the TWL4030 PM sleep/wakeup sequence mfd: Fix asic3 build error mfd: Fixed gpio polarity of omap-usb gpio USB-phy reset	2011-05-11 19:00:15 -07:00
Linus Torvalds	409ab140e2	Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] fix alloc_pgste check in init_new_context [S390] oprofile: fix min/max interval query checks [S390] replace diag10() with diag10_range() function [S390] disassembler: handle b280/spp instruction [S390] kernel: Initialize register 14 when starting new CPU [S390] dasd: prevent IO error during reserve/release loop [S390] sclp/memory hotplug: fix initial usecount of increments	2011-05-11 18:59:45 -07:00
Linus Torvalds	ce8453776d	Revert "Bluetooth: fix shutdown on SCO sockets" This reverts commit `f21ca5fff6`. Quoth Gustavo F. Padovan: "Commit `f21ca5fff6` can cause a NULL dereference if we call shutdown in a bluetooth SCO socket and doesn't wait the shutdown completion to call close(). Please revert it. I may have a fix for it soon, but we don't have time anymore, so revert is the way to go. ;)" Requested-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:58:16 -07:00
Linus Torvalds	0e6f76c70e	Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM PM / Hibernate: Make snapshot_release() restore GFP mask PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl	2011-05-11 18:57:05 -07:00
Mel Gorman	1d929b7a84	mm: tracing: add missing GFP flags to tracing include/linux/gfp.h and include/trace/events/gfpflags.h are out of sync. When tracing is enabled, certain flags are not recognised and the text output is less useful as a result. Add the missing flags. Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:45 -07:00
Hugh Dickins	59a16ead57	tmpfs: fix spurious ENOSPC when racing with unswap Testing the shmem_swaplist replacements for igrab() revealed another bug: writes to /dev/loop0 on a tmpfs file which fills its filesystem were sometimes failing with "Buffer I/O error"s. These came from ENOSPC failures of shmem_getpage(), when racing with swapoff: the same could happen when racing with another shmem_getpage(), pulling the page in from swap in between our find_lock_page() and our taking the info->lock (though not in the single-threaded loop case). This is unacceptable, and surprising that I've not noticed it before: it dates back many years, but (presumably) was made a lot easier to reproduce in 2.6.36, which sited a page preallocation in the race window. Fix it by rechecking the page cache before settling on an ENOSPC error. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:45 -07:00
Hugh Dickins	778dd893ae	tmpfs: fix race between umount and swapoff The use of igrab() in swapoff's shmem_unuse_inode() is just as vulnerable to umount as that in shmem_writepage(). Fix this instance by extending the protection of shmem_swaplist_mutex right across shmem_unuse_inode(): while it's on the list, the inode cannot be evicted (and the filesystem cannot be unmounted) without shmem_evict_inode() taking that mutex to remove it from the list. But since shmem_writepage() might take that mutex, we should avoid making memory allocations or memcg charges while holding it: prepare them at the outer level in shmem_unuse(). When mem_cgroup_cache_charge() was originally placed, we didn't know until that point that the page from swap was actually a shmem page; but nowadays it's noted in the swap_map, so we're safe to charge upfront. For the radix_tree, do as is done in shmem_getpage(): preload upfront, but don't pin to the cpu; so we make a habit of refreshing the node pool, but might dip into GFP_NOWAIT reserves on occasion if subsequently preempted. With the allocation and charge moved out from shmem_unuse_inode(), we can also hold index map and info->lock over from finding the entry. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:45 -07:00
Hugh Dickins	b1dea800ac	tmpfs: fix race between umount and writepage Konstanin Khlebnikov reports that a dangerous race between umount and shmem_writepage can be reproduced by this script: for i in {1..300} ; do mkdir $i while true ; do mount -t tmpfs none $i dd if=/dev/zero of=$i/test bs=1M count=$(($RANDOM % 100)) umount $i done & done on a 6xCPU node with 8Gb RAM: kernel very unstable after this accident. =) Kernel log: VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds. Have a nice day... WARNING: at lib/list_debug.c:53 __list_del_entry+0x8d/0x98() list_del corruption. prev->next should be ffff880222fdaac8, but was (null) Pid: 11222, comm: mount.tmpfs Not tainted 2.6.39-rc2+ #4 Call Trace: warn_slowpath_common+0x80/0x98 warn_slowpath_fmt+0x41/0x43 __list_del_entry+0x8d/0x98 evict+0x50/0x113 iput+0x138/0x141 ... BUG: unable to handle kernel paging request at ffffffffffffffff IP: shmem_free_blocks+0x18/0x4c Pid: 10422, comm: dd Tainted: G W 2.6.39-rc2+ #4 Call Trace: shmem_recalc_inode+0x61/0x66 shmem_writepage+0xba/0x1dc pageout+0x13c/0x24c shrink_page_list+0x28e/0x4be shrink_inactive_list+0x21f/0x382 ... shmem_writepage() calls igrab() on the inode for the page which came from page reclaim, to add it later into shmem_swaplist for swapoff operation. This igrab() can race with super-block deactivating process: shrink_inactive_list() deactivate_super() pageout() tmpfs_fs_type->kill_sb() shmem_writepage() kill_litter_super() generic_shutdown_super() evict_inodes() igrab() atomic_read(&inode->i_count) skip-inode iput() if (!list_empty(&sb->s_inodes)) printk("VFS: Busy inodes after... This igrap-iput pair was added in commit `1b1b32f2c6` "tmpfs: fix shmem_swaplist races" based on incorrect assumptions: igrab() protects the inode from concurrent eviction by deletion, but it does nothing to protect it from concurrent unmounting, which goes ahead despite the raised i_count. So this use of igrab() was wrong all along, but the race made much worse in 2.6.37 when commit `63997e98a3` "split invalidate_inodes()" replaced two attempts at invalidate_inodes() by a single evict_inodes(). Konstantin posted a plausible patch, raising sb->s_active too: I'm unsure whether it was correct or not; but burnt once by igrab(), I am sure that we don't want to rely more deeply upon externals here. Fix it by adding the inode to shmem_swaplist earlier, while the page lock on page in page cache still secures the inode against eviction, without artifically raising i_count. It was originally added later because shmem_unuse_inode() is liable to remove an inode from the list while it's unswapped; but we can guard against that by taking spinlock before dropping mutex. Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:45 -07:00
Andi Kleen	21a3c96468	memcg: allocate memory cgroup structures in local nodes Commit `dde79e005a` ("page_cgroup: reduce allocation overhead for page_cgroup array for CONFIG_SPARSEMEM") added a regression that the memory cgroup data structures all end up in node 0 because the first attempt at allocating them would not pass in a node hint. Since the initialization runs on CPU #0 it would all end up node 0. This is a problem on large memory systems, where node 0 would lose a lot of memory. Change the alloc_pages_exact() to alloc_pages_exact_nid(). This will still fall back to other nodes if not enough memory is available. [ RED-PEN: right now it would fall back first before trying vmalloc_node. Probably not the best strategy ... But I left it like that for now. ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Reported-by: Doug Nelson Cc: David Rientjes <rientjes@google.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:45 -07:00
Andi Kleen	ee85c2e145	mm: add alloc_pages_exact_nid() Add a alloc_pages_exact_nid() that allocates on a specific node. The naming is quite broken, but fixing that would need a larger renaming action. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: tweak comment] Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:45 -07:00
Harry Wei	71a6d0af5b	MAINTAINERS: fix sorting Take alphabetical orders for MAINTAINERS file. Signed-off-by: Harry Wei <harryxiyou@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:45 -07:00
Yinghai Lu	8f389a99b6	mm: use alloc_bootmem_node_nopanic() on really needed path Stefan found nobootmem does not work on his system that has only 8M of RAM. This causes an early panic: BIOS-provided physical RAM map: BIOS-88: 0000000000000000 - 000000000009f000 (usable) BIOS-88: 0000000000100000 - 0000000000840000 (usable) bootconsole [earlyser0] enabled Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS! DMI not present or invalid. last_pfn = 0x840 max_arch_pfn = 0x100000 init_memory_mapping: 0000000000000000-0000000000840000 8MB LOWMEM available. mapped low ram: 0 - 00840000 low ram: 0 - 00840000 Zone PFN ranges: DMA 0x00000001 -> 0x00001000 Normal empty Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x00000001 -> 0x0000009f 0: 0x00000100 -> 0x00000840 BUG: Int 6: CR2 (null) EDI c034663c ESI (null) EBP c0329f38 ESP c0329ef4 EBX c0346380 EDX 00000006 ECX ffffffff EAX fffffff4 err (null) EIP c0353191 CS c0320060 flg 00010082 Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null) 00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null) (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null) Pid: 0, comm: swapper Not tainted 2.6.36 #5 Call Trace: [<c02e3707>] ? 0xc02e3707 [<c035e6e5>] 0xc035e6e5 [<c0353191>] ? 0xc0353191 [<c03534d6>] 0xc03534d6 [<c034f1cd>] 0xc034f1cd [<c034a824>] 0xc034a824 [<c03513cb>] ? 0xc03513cb [<c0349432>] 0xc0349432 [<c0349066>] 0xc0349066 It turns out that we should ignore the low limit of 16M. Use alloc_bootmem_node_nopanic() in this case. [akpm@linux-foundation.org: less mess] Signed-off-by: Yinghai LU <yinghai@kernel.org> Reported-by: Stefan Hellermann <stefan@the2masters.de> Tested-by: Stefan Hellermann <stefan@the2masters.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> [2.6.34+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:44 -07:00
Minchan Kim	bad49d9c89	mm: check PageUnevictable in lru_deactivate_fn() The lru_deactivate_fn should not move page which in on unevictable lru into inactive list. Otherwise, we can meet BUG when we use isolate_lru_pages as __isolate_lru_page could return -EINVAL. Reported-by: Ying Han <yinghan@google.com> Tested-by: Ying Han <yinghan@google.com> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Rik van Riel<riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:44 -07:00
Ben Dooks	52cd4e5c62	drivers/rtc/rtc-s3c.c: fixup wake support for rtc The driver is not balancing set_irq and disable_irq_wake() calls, so ensure that it keeps track of whether the wake is enabled. The fixes the following error on S3C6410 devices: WARNING: at kernel/irq/manage.c:382 set_irq_wake+0x84/0xec() Unbalanced IRQ 92 wake disable Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-11 18:50:44 -07:00
Rafael J. Wysocki	36cb7035ea	PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM The SNAPSHOT_S2RAM ioctl used for implementing the feature allowing one to suspend to RAM after creating a hibernation image is currently broken, because it doesn't clear the "ready" flag in the struct snapshot_data object handled by it. As a result, the SNAPSHOT_UNFREEZE doesn't work correctly after SNAPSHOT_S2RAM has returned and the user space hibernate task cannot thaw the other processes as appropriate. Make SNAPSHOT_S2RAM clear data->ready to fix this problem. Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org	2011-05-11 21:10:58 +02:00
Rafael J. Wysocki	9744997a8a	PM / Hibernate: Make snapshot_release() restore GFP mask If the process using the hibernate user space interface closes /dev/snapshot after creating a hibernation image without thawing tasks, snapshot_release() should call pm_restore_gfp_mask() to restore the GFP mask used before the creation of the image. Make that happen. Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org	2011-05-11 21:10:43 +02:00
Rafael J. Wysocki	87186475a4	PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl A warning is printed by pm_restrict_gfp_mask() while the SNAPSHOT_S2RAM ioctl is being executed after creating a hibernation image, because pm_restrict_gfp_mask() has been called once already before the image creation and suspend_devices_and_enter() calls it once again. This happens after commit `452aa6999e` (mm/pm: force GFP_NOIO during suspend/hibernation and resume). To avoid this issue, move pm_restrict_gfp_mask() and pm_restore_gfp_mask() from suspend_devices_and_enter() to its caller in kernel/power/suspend.c. Reported-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org	2011-05-11 21:10:14 +02:00
Henry C Chang	d3d0720d4a	ceph: do not use i_wrbuffer_ref as refcount for Fb cap We increments i_wrbuffer_ref when taking the Fb cap. This breaks the dirty page accounting and causes looping in __ceph_do_pending_vmtruncate, and ceph client hangs. This bug can be reproduced occasionally by running blogbench. Add a new field i_wb_ref to inode and dedicate it to Fb reference counting. Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>	2011-05-11 10:44:48 -07:00

1 2 3 4 5 ...

245073 commits