linux-hardened/net
Linus Lüssing 9b4aec647a batman-adv: fix rare race conditions on interface removal
In rare cases during shutdown the following general protection fault can
happen:

  general protection fault: 0000 [#1] SMP
  Modules linked in: batman_adv(O-) [...]
  CPU: 3 PID: 1714 Comm: rmmod Tainted: G           O    4.6.0-rc6+ #1
  [...]
  Call Trace:
   [<ffffffffa0363294>] batadv_hardif_disable_interface+0x29a/0x3a6 [batman_adv]
   [<ffffffffa0373db4>] batadv_softif_destroy_netlink+0x4b/0xa4 [batman_adv]
   [<ffffffff813b52f3>] __rtnl_link_unregister+0x48/0x92
   [<ffffffff813b9240>] rtnl_link_unregister+0xc1/0xdb
   [<ffffffff8108547c>] ? bit_waitqueue+0x87/0x87
   [<ffffffffa03850d2>] batadv_exit+0x1a/0xf48 [batman_adv]
   [<ffffffff810c26f9>] SyS_delete_module+0x136/0x1b0
   [<ffffffff8144dc65>] entry_SYSCALL_64_fastpath+0x18/0xa8
   [<ffffffff8108aaca>] ? trace_hardirqs_off_caller+0x37/0xa6
  Code: 89 f7 e8 21 bd 0d e1 4d 85 e4 75 0e 31 f6 48 c7 c7 50 d7 3b a0 e8 50 16 f2 e0 49 8b 9c 24 28 01 00 00 48 85 db 0f 84 b2 00 00 00 <48> 8b 03 4d 85 ed 48 89 45 c8 74 09 4c 39 ab f8 00 00 00 75 1c
  RIP  [<ffffffffa0371852>] batadv_purge_outstanding_packets+0x1c8/0x291 [batman_adv]
   RSP <ffff88001da5fd78>
  ---[ end trace 803b9bdc6a4a952b ]---
  Kernel panic - not syncing: Fatal exception in interrupt
  Kernel Offset: disabled
  ---[ end Kernel panic - not syncing: Fatal exception in interrupt

It does not happen often, but may potentially happen when frequently
shutting down and reinitializing an interface. With some carefully
placed msleep()s/mdelay()s it can be reproduced easily.

The issue is, that on interface removal, any still running worker thread
of a forwarding packet will race with the interface purging routine to
free a forwarding packet. Temporarily giving up a spin-lock to be able
to sleep in the purging routine is not safe.

Furthermore, there is a potential general protection fault not just for
the purging side shown above, but also on the worker side: Temporarily
removing a forw_packet from the according forw_{bcast,bat}_list will make
it impossible for the purging routine to catch and cancel it.

 # How this patch tries to fix it:

With this patch we split the queue purging into three steps: Step 1),
removing forward packets from the queue of an interface and by that
claim it as our responsibility to free.

Step 2), we are either lucky to cancel a pending worker before it starts
to run. Or if it is already running, we wait and let it do its thing,
except two things:

Through the claiming in step 1) we prevent workers from a) re-arming
themselves. And b) prevent workers from freeing packets which we still
hold in the interface purging routine.

Finally, step 3, we are sure that no forwarding packets are pending or
even running anymore on the interface to remove. We can then safely free
the claimed forwarding packets.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-11-08 19:02:39 +01:00
..
6lowpan 6lowpan: ndisc: no overreact if no short address is available 2016-09-19 20:19:34 +02:00
9p IB/core: add support to create a unsafe global rkey to ib_create_pd 2016-09-23 13:47:44 -04:00
802 net: use core MTU range checking in misc drivers 2016-10-20 14:51:10 -04:00
8021q net: use core MTU range checking in core net infra 2016-10-20 14:51:09 -04:00
appletalk appletalk: use IS_ENABLED() instead of checking for built-in or module 2016-09-10 21:19:10 -07:00
atm net: remove MTU limits on a few ether_setup callers 2016-10-21 13:57:50 -04:00
ax25
batman-adv batman-adv: fix rare race conditions on interface removal 2016-11-08 19:02:39 +01:00
bluetooth net: remove MTU limits on a few ether_setup callers 2016-10-21 13:57:50 -04:00
bridge net: use core MTU range checking in core net infra 2016-10-20 14:51:09 -04:00
caif net caif: insert missing spaces in pr_* messages and unbreak multi-line strings 2016-10-28 13:47:33 -04:00
can can: only call can_stat_update with procfs 2016-06-23 11:23:49 +02:00
ceph crush: remove redundant local variable 2016-10-05 23:02:10 +02:00
core net: dev: Fix non-RCU based lower dev walker 2016-10-29 15:50:30 -04:00
dcb
dccp tcp/dccp: drop SYN packets if accept queue is full 2016-10-29 15:09:21 -04:00
decnet net: fix decnet rtnexthop parsing 2016-07-05 14:08:47 -07:00
dns_resolver
dsa net: remove MTU limits on a few ether_setup callers 2016-10-21 13:57:50 -04:00
ethernet net: deprecate eth_change_mtu, remove usage 2016-10-13 09:36:57 -04:00
hsr genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
ieee802154 genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
ipv4 tcp/dccp: drop SYN packets if accept queue is full 2016-10-29 15:09:21 -04:00
ipv6 genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
ipx
irda genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
iucv Subject: [PATCH] af_iucv: drop skbs rejected by filter 2016-10-12 01:56:04 -04:00
kcm Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-07 15:36:58 -07:00
key
l2tp genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
l3mdev net: ipv6: Remove l3mdev_get_saddr6 2016-09-10 23:12:53 -07:00
lapb
llc llc: switch type to bool as the timeout is only tested versus 0 2016-09-17 10:05:05 -04:00
mac80211 net: use core MTU range checking in wireless drivers 2016-10-20 14:51:08 -04:00
mac802154 mac802154: use rate limited warnings for malformed frames 2016-09-19 20:19:34 +02:00
mpls lwt: Remove unused len field 2016-10-23 17:45:01 -04:00
ncsi net/ncsi: Introduce ncsi_stop_dev() 2016-10-04 02:11:51 -04:00
netfilter genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
netlabel genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
netlink genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
netrom
nfc genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
openvswitch genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
packet packet: call fanout_release, while UNREGISTERING a netdev 2016-10-06 20:50:18 -04:00
phonet net: use core MTU range checking in misc drivers 2016-10-20 14:51:10 -04:00
qrtr
rds rds: Remove duplicate prefix from rds_conn_path_error use 2016-10-17 11:07:22 -04:00
rfkill
rose rose: limit sk_filter trim to payload 2016-07-13 11:53:40 -07:00
rxrpc rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase 2016-10-06 08:11:51 +01:00
sched netlink: Add nla_memdup() to wrap kmemdup() use on nlattr 2016-10-29 14:57:42 -04:00
sctp sctp: remove the old ttl expires policy 2016-10-13 09:44:14 -04:00
strparser strparser: Propagate correct error code in strp_recv() 2016-10-12 01:51:49 -04:00
sunrpc udp: use it's own memory accounting schema 2016-10-22 17:05:05 -04:00
switchdev switchdev: Remove redundant variable 2016-10-29 14:58:33 -04:00
tipc genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
unix skb_splice_bits(): get rid of callback 2016-10-03 20:40:56 -04:00
vmw_vsock VSOCK: Don't dec ack backlog twice for rejected connections 2016-09-27 07:59:25 -04:00
wimax genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
wireless netlink: Add nla_memdup() to wrap kmemdup() use on nlattr 2016-10-29 14:57:42 -04:00
x25 net: x25: remove null checks on arrays calling_ae and called_ae 2016-09-09 18:13:30 -07:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2016-10-28 13:26:27 -04:00
compat.c
Kconfig strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
Makefile strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
socket.c vfs: Remove {get,set,remove}xattr inode operations 2016-10-07 21:48:36 -04:00
sysctl_net.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-10-06 09:52:23 -07:00