linux-hardened/net
Eric Dumazet 00cfec3748 net: add a synchronize_net() in netdev_rx_handler_unregister()
commit 35d48903e9 (bonding: fix rx_handler locking) added a race
in bonding driver, reported by Steven Rostedt who did a very good
diagnosis :

<quoting Steven>

I'm currently debugging a crash in an old 3.0-rt kernel that one of our
customers is seeing. The bug happens with a stress test that loads and
unloads the bonding module in a loop (I don't know all the details as
I'm not the one that is directly interacting with the customer). But the
bug looks to be something that may still be present and possibly present
in mainline too. It will just be much harder to trigger it in mainline.

In -rt, interrupts are threads, and can schedule in and out just like
any other thread. Note, mainline now supports interrupt threads so this
may be easily reproducible in mainline as well. I don't have the ability
to tell the customer to try mainline or other kernels, so my hands are
somewhat tied to what I can do.

But according to a core dump, I tracked down that the eth irq thread
crashed in bond_handle_frame() here:

        slave = bond_slave_get_rcu(skb->dev);
        bond = slave->bond; <--- BUG

the slave returned was NULL and accessing slave->bond caused a NULL
pointer dereference.

Looking at the code that unregisters the handler:

void netdev_rx_handler_unregister(struct net_device *dev)
{

        ASSERT_RTNL();
        RCU_INIT_POINTER(dev->rx_handler, NULL);
        RCU_INIT_POINTER(dev->rx_handler_data, NULL);
}

Which is basically:
        dev->rx_handler = NULL;
        dev->rx_handler_data = NULL;

And looking at __netif_receive_skb() we have:

        rx_handler = rcu_dereference(skb->dev->rx_handler);
        if (rx_handler) {
                if (pt_prev) {
                        ret = deliver_skb(skb, pt_prev, orig_dev);
                        pt_prev = NULL;
                }
                switch (rx_handler(&skb)) {

My question to all of you is, what stops this interrupt from happening
while the bonding module is unloading?  What happens if the interrupt
triggers and we have this:

        CPU0                    CPU1
        ----                    ----
  rx_handler = skb->dev->rx_handler

                        netdev_rx_handler_unregister() {
                           dev->rx_handler = NULL;
                           dev->rx_handler_data = NULL;

  rx_handler()
   bond_handle_frame() {
    slave = skb->dev->rx_handler;
    bond = slave->bond; <-- NULL pointer dereference!!!

What protection am I missing in the bond release handler that would
prevent the above from happening?

</quoting Steven>

We can fix bug this in two ways. First is adding a test in
bond_handle_frame() and others to check if rx_handler_data is NULL.

A second way is adding a synchronize_net() in
netdev_rx_handler_unregister() to make sure that a rcu protected reader
has the guarantee to see a non NULL rx_handler_data.

The second way is better as it avoids an extra test in fast path.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29 15:38:15 -04:00
..
9p Revert parts of "hlist: drop the node parameter from iterators" 2013-03-08 15:05:34 -08:00
802 mrp: make mrp_rcv static 2013-02-11 14:16:26 -05:00
8021q 8021q: fix a potential use-after-free 2013-03-24 17:27:28 -04:00
appletalk hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
atm hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ax25 hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
batman-adv batman-adv: verify tt len does not exceed packet len 2013-03-11 22:59:47 +01:00
bluetooth Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2013-03-18 15:17:11 -04:00
bridge bridge: fix crash when set mac address of br interface 2013-03-24 17:27:28 -04:00
caif CAIF: fix sparse warning for caif_usb 2013-03-04 14:12:07 -05:00
can hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2013-03-12 09:22:42 -07:00
core net: add a synchronize_net() in netdev_rx_handler_unregister() 2013-03-29 15:38:15 -04:00
dcb dcbnl: fix various netlink info leaks 2013-03-10 05:19:26 -04:00
dccp Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
decnet hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
dns_resolver Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2012-12-16 15:40:50 -08:00
dsa dsa: make dsa_switch_setup check for valid port names 2013-01-21 15:40:12 -05:00
ethernet net: split eth_mac_addr for better error handling 2013-01-21 14:07:44 -05:00
ieee802154 6lowpan: Fix endianness issue in is_addr_link_local(). 2013-03-10 16:49:35 -04:00
ipv4 ipv4: Fix ip-header identification for gso packets. 2013-03-26 13:50:05 -04:00
ipv6 ipv6: don't accept node local multicast traffic from the wire 2013-03-29 14:57:33 -04:00
ipx hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
irda net/irda: add missing error path release_sock call 2013-03-20 12:23:13 -04:00
iucv hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
key Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2013-03-27 14:07:04 -04:00
l2tp l2tp: unhash l2tp sessions on delete, not on free 2013-03-20 12:10:39 -04:00
lapb net/lapb: remove depends on CONFIG_EXPERIMENTAL 2013-01-11 11:40:01 -08:00
llc hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
mac80211 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2013-03-25 14:50:17 -04:00
mac802154 Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
netfilter ipvs: remove extra rcu lock 2013-03-19 21:21:52 +09:00
netlabel netlabel: fix build problems when CONFIG_IPV6=n 2013-03-08 11:33:51 -05:00
netlink genetlink: trigger BUG_ON if a group name is too long 2013-03-20 12:05:51 -04:00
netrom hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
nfc NFC: llcp: Report error to pending sockets when a device is removed 2013-03-08 17:35:22 +01:00
openvswitch Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch 2013-03-15 09:00:39 -04:00
packet hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
phonet hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
rds net/rds: zero last byte for strncpy 2013-03-08 00:35:44 -05:00
rfkill rfkill: don't use [delayed_]work_pending() 2012-12-28 13:40:16 -08:00
rose hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
rxrpc Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
sched net: fq_codel: Fix off-by-one error 2013-03-29 15:32:23 -04:00
sctp sctp: don't break the loop while meeting the active_path so as to find the matched transport 2013-03-13 10:09:55 -04:00
sunrpc NFS client bugfixes for Linux 3.9 2013-03-26 14:23:45 -07:00
tipc hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
unix af_unix: dont send SCM_CREDENTIAL when dest socket is NULL 2013-03-26 12:33:55 -04:00
vmw_vsock VSOCK: Don't reject PF_VSOCK protocol 2013-02-18 15:02:51 -05:00
wimax
wireless Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2013-03-25 14:50:17 -04:00
x25 hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2013-03-27 14:07:04 -04:00
compat.c
Kconfig Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
Makefile VSOCK: Introduce VM Sockets 2013-02-10 19:41:08 -05:00
nonet.c
socket.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
sysctl_net.c user_ns: get rid of duplicate code in net_ctl_permissions 2012-11-18 20:32:45 -05:00