linux-hardened/block
Peter Zijlstra eabe06595d block/mq: Cure cpu hotplug lock inversion
By poking at /debug/sched_features I triggered the following splat:

 [] ======================================================
 [] WARNING: possible circular locking dependency detected
 [] 4.11.0-00873-g964c8b7-dirty #694 Not tainted
 [] ------------------------------------------------------
 [] bash/2109 is trying to acquire lock:
 []  (cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff8120cb8b>] static_key_slow_dec+0x1b/0x50
 []
 [] but task is already holding lock:
 []  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
 []
 [] which lock already depends on the new lock.
 []
 []
 [] the existing dependency chain (in reverse order) is:
 []
 [] -> #2 (&sb->s_type->i_mutex_key#4){+++++.}:
 []        lock_acquire+0x100/0x210
 []        down_write+0x28/0x60
 []        start_creating+0x5e/0xf0
 []        debugfs_create_dir+0x13/0x110
 []        blk_mq_debugfs_register+0x21/0x70
 []        blk_mq_register_dev+0x64/0xd0
 []        blk_register_queue+0x6a/0x170
 []        device_add_disk+0x22d/0x440
 []        loop_add+0x1f3/0x280
 []        loop_init+0x104/0x142
 []        do_one_initcall+0x43/0x180
 []        kernel_init_freeable+0x1de/0x266
 []        kernel_init+0xe/0x100
 []        ret_from_fork+0x31/0x40
 []
 [] -> #1 (all_q_mutex){+.+.+.}:
 []        lock_acquire+0x100/0x210
 []        __mutex_lock+0x6c/0x960
 []        mutex_lock_nested+0x1b/0x20
 []        blk_mq_init_allocated_queue+0x37c/0x4e0
 []        blk_mq_init_queue+0x3a/0x60
 []        loop_add+0xe5/0x280
 []        loop_init+0x104/0x142
 []        do_one_initcall+0x43/0x180
 []        kernel_init_freeable+0x1de/0x266
 []        kernel_init+0xe/0x100
 []        ret_from_fork+0x31/0x40

 []  *** DEADLOCK ***
 []
 [] 3 locks held by bash/2109:
 []  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff81292bcd>] vfs_write+0x17d/0x1a0
 []  #1:  (debugfs_srcu){......}, at: [<ffffffff8155a90d>] full_proxy_write+0x5d/0xd0
 []  #2:  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
 []
 [] stack backtrace:
 [] CPU: 9 PID: 2109 Comm: bash Not tainted 4.11.0-00873-g964c8b7-dirty #694
 [] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
 [] Call Trace:

 []  lock_acquire+0x100/0x210
 []  get_online_cpus+0x2a/0x90
 []  static_key_slow_dec+0x1b/0x50
 []  static_key_disable+0x20/0x30
 []  sched_feat_write+0x131/0x170
 []  full_proxy_write+0x97/0xd0
 []  __vfs_write+0x28/0x120
 []  vfs_write+0xb5/0x1a0
 []  SyS_write+0x49/0xa0
 []  entry_SYSCALL_64_fastpath+0x23/0xc2

This is because of the cpu hotplug lock rework. Break the chain at #1
by reversing the lock acquisition order. This way i_mutex_key#4 no
longer depends on cpu_hotplug_lock and things are good.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-04 07:57:13 -06:00
..
partitions partitions/efi: Fix integer overflow in GPT size calculation 2017-01-17 09:02:31 -07:00
badblocks.c badblocks: badblocks_set/clear update unacked_exist 2016-10-21 15:45:47 -06:00
bfq-cgroup.c block, bfq: split bfq-iosched.c into multiple source files 2017-04-19 08:48:24 -06:00
bfq-iosched.c block, bfq: don't dereference bic before null checking it 2017-04-20 08:19:23 -06:00
bfq-iosched.h bfq: fix compile error if CONFIG_CGROUPS=n 2017-04-20 09:39:12 -06:00
bfq-wf2q.c block, bfq: split bfq-iosched.c into multiple source files 2017-04-19 08:48:24 -06:00
bio-integrity.c block: remove bio_is_rw 2016-10-28 08:45:17 -06:00
bio.c block: trace completion of all bios. 2017-04-07 09:40:52 -06:00
blk-cgroup.c blkcg: allocate struct blkcg_gq outside request queue spinlock 2017-03-29 11:27:19 -06:00
blk-core.c blk-mq: unify hctx delay_work and run_work 2017-04-28 08:11:43 -06:00
blk-exec.c block: remove the errors field from struct request 2017-04-20 12:16:10 -06:00
blk-flush.c block: make __blk_end_bidi_request private 2017-04-19 10:19:47 -06:00
blk-integrity.c block: fix blk_integrity_register to use template's interval_exp if not 0 2017-04-23 12:59:56 -06:00
blk-ioc.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2017-03-03 10:53:35 -08:00
blk-lib.c block: remove the discard_zeroes_data flag 2017-04-08 11:25:38 -06:00
blk-map.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
blk-merge.c block: implement splitting of REQ_OP_WRITE_ZEROES bios 2017-04-08 11:25:38 -06:00
blk-mq-cpumap.c blk-mq: export blk_mq_map_queues 2016-11-08 17:30:00 -05:00
blk-mq-debugfs.c blk-mq: Add blk_mq_ops.show_rq() 2017-04-26 15:09:04 -06:00
blk-mq-pci.c blk-mq-pci: Fix two spelling mistakes 2017-03-29 11:09:51 -06:00
blk-mq-sched.c blk-mq-sched: remove hack that bypasses scheduler for reserved requests 2017-05-02 07:52:08 -06:00
blk-mq-sched.h blk-mq: Remove blk_mq_sched_move_to_dispatch() 2017-04-20 17:28:30 -06:00
blk-mq-sysfs.c blk-mq: Only unregister hctxs for which registration succeeded 2017-04-26 15:09:04 -06:00
blk-mq-tag.c blk-mq: add shallow depth option for blk_mq_get_tag() 2017-04-14 14:06:54 -06:00
blk-mq-tag.h blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset 2017-03-02 08:56:04 -07:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c block/mq: Cure cpu hotplug lock inversion 2017-05-04 07:57:13 -06:00
blk-mq.h blk-mq-debugfs: Rename functions for registering and unregistering the mq directory 2017-04-26 15:09:04 -06:00
blk-settings.c block: remove the discard_zeroes_data flag 2017-04-08 11:25:38 -06:00
blk-softirq.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/topology.h> 2017-03-02 08:42:26 +01:00
blk-stat.c blk-stat: kill blk_stat_rq_ddir() 2017-04-21 07:56:23 -06:00
blk-stat.h blk-stat: kill blk_stat_rq_ddir() 2017-04-21 07:56:23 -06:00
blk-sysfs.c blk-mq: Register <dev>/queue/mq after having registered <dev>/queue 2017-04-26 15:09:04 -06:00
blk-tag.c blk-mq-sched: add framework for MQ capable IO schedulers 2017-01-17 10:04:20 -07:00
blk-throttle.c blk-throttle: fix unused variable warning with BLK_DEV_THROTTLING_LOW=n 2017-04-20 09:41:36 -06:00
blk-timeout.c block: remove the errors field from struct request 2017-04-20 12:16:10 -06:00
blk-wbt.c blk-stat: kill blk_stat_rq_ddir() 2017-04-21 07:56:23 -06:00
blk-wbt.h block: Make writeback throttling defaults consistent for SQ devices 2017-04-19 08:49:03 -06:00
blk-zoned.c block: Rename blk_queue_zone_size and bdev_zone_size 2017-01-12 07:58:32 -07:00
blk.h block: Export blk_init_request_from_bio() 2017-04-19 17:38:30 -06:00
bounce.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2015-09-19 18:57:09 -07:00
bsg-lib.c scsi: introduce a result field in struct scsi_request 2017-04-20 12:16:10 -06:00
bsg.c scsi: introduce a result field in struct scsi_request 2017-04-20 12:16:10 -06:00
cfq-iosched.c cfq: Disable writeback throttling by default 2017-04-05 08:15:08 -06:00
cmdline-parser.c block: remove unrelated header files and export symbol 2014-01-21 20:18:26 -08:00
compat_ioctl.c block: remove the discard_zeroes_data flag 2017-04-08 11:25:38 -06:00
deadline-iosched.c block: enumify ELEVATOR_*_MERGE 2017-02-08 13:43:06 -07:00
elevator.c block: don't call blk_mq_quiesce_queue() after queue is frozen 2017-05-02 11:33:08 -06:00
genhd.c block: hide badblocks attribute by default 2017-04-28 08:26:42 -06:00
ioctl.c block: remove the discard_zeroes_data flag 2017-04-08 11:25:38 -06:00
ioprio.c block: Optimize ioprio_best() 2017-04-19 17:38:36 -06:00
Kconfig blk-throttle: add configure option for new .low interface 2017-03-28 08:02:20 -06:00
Kconfig.iosched block, bfq: add full hierarchical scheduling and cgroups support 2017-04-19 08:30:26 -06:00
kyber-iosched.c blk-stat: convert blk-stat bucket callback to signed 2017-04-20 15:29:16 -06:00
Makefile block, bfq: split bfq-iosched.c into multiple source files 2017-04-19 08:48:24 -06:00
mq-deadline.c block: enumify ELEVATOR_*_MERGE 2017-02-08 13:43:06 -07:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block/sed-opal: allocate struct opal_dev dynamically 2017-02-17 12:41:47 -07:00
partition-generic.c block: get rid of blk_integrity_revalidate() 2017-04-21 14:17:27 -06:00
scsi_ioctl.c scsi: introduce a result field in struct scsi_request 2017-04-20 12:16:10 -06:00
sed-opal.c block: sed-opal: Tone down all the pr_* to debugs 2017-04-07 14:24:16 -06:00
t10-pi.c block: constify struct blk_integrity_profile 2017-03-24 20:34:39 -06:00