Commit graph

601851 commits

Author SHA1 Message Date
Shweta Choudaha
0a46baaf63 ip6gre: Allow live link address change
The ip6 GRE tap device should not be forced to down state to change
the mac address and should allow live address change for tap device
similar to ipv4 gre.

Signed-off-by: Shweta Choudaha <schoudah@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 22:02:17 -07:00
David S. Miller
a436d20df9 Merge branch 'cls_u32-hwoffload-fixes'
Jakub Kicinski says:

====================
incremental cls_u32 hardware offload fixes

These are incremental changes from v1 of cls_u32 fixes.
First patch is reposted in its entirety, patch 2 is an
incremental change from patch 2 of the original series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 21:43:15 -07:00
Jakub Kicinski
201c44bd8f net: cls_u32: be more strict about skip-sw flag for knodes
Return an error if user requested skip-sw and the underlaying
hardware cannot handle tc offloads (or offloads are disabled).
This patch fixes the knode handling.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 21:43:14 -07:00
Jakub Kicinski
6eef3801e7 net: cls_u32: catch all hardware offload errors
Errors reported by u32_replace_hw_hnode() were not propagated.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 21:43:14 -07:00
Bert Kenward
3497ed8c85 sfc: report supported link speeds on SFP connections
7000-series SFC NICs connected with an SFP+ module currently fail to
report any supported link speeds.

Reported-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Reviewed-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:18:45 -07:00
Eric Dumazet
e0d194adfa net_sched: add missing paddattr description
"make htmldocs" complains otherwise:

.//net/core/gen_stats.c:65: warning: No description found for parameter 'padattr'
.//net/core/gen_stats.c:101: warning: No description found for parameter 'padattr'

Fixes: 9854518ea0 ("sched: align nlattr properly when needed")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:17:39 -07:00
Jakub Sitnicki
00bc0ef588 ipv6: Skip XFRM lookup if dst_entry in socket cache is valid
At present we perform an xfrm_lookup() for each UDPv6 message we
send. The lookup involves querying the flow cache (flow_cache_lookup)
and, in case of a cache miss, creating an XFRM bundle.

If we miss the flow cache, we can end up creating a new bundle and
deriving the path MTU (xfrm_init_pmtu) from on an already transformed
dst_entry, which we pass from the socket cache (sk->sk_dst_cache) down
to xfrm_lookup(). This can happen only if we're caching the dst_entry
in the socket, that is when we're using a connected UDP socket.

To put it another way, the path MTU shrinks each time we miss the flow
cache, which later on leads to incorrectly fragmented payload. It can
be observed with ESPv6 in transport mode:

  1) Set up a transformation and lower the MTU to trigger fragmentation
    # ip xfrm policy add dir out src ::1 dst ::1 \
      tmpl src ::1 dst ::1 proto esp spi 1
    # ip xfrm state add src ::1 dst ::1 \
      proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
    # ip link set dev lo mtu 1500

  2) Monitor the packet flow and set up an UDP sink
    # tcpdump -ni lo -ttt &
    # socat udp6-listen:12345,fork /dev/null &

  3) Send a datagram that needs fragmentation with a connected socket
    # perl -e 'print "@" x 1470 | socat - udp6:[::1]:12345
    2016/06/07 18:52:52 socat[724] E read(3, 0x555bb3d5ba00, 8192): Protocol error
    00:00:00.000000 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x2), length 1448
    00:00:00.000014 IP6 ::1 > ::1: frag (1448|32)
    00:00:00.000050 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x3), length 1272
    (^ ICMPv6 Parameter Problem)
    00:00:00.000022 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x5), length 136

  4) Compare it to a non-connected socket
    # perl -e 'print "@" x 1500' | socat - udp6-sendto:[::1]:12345
    00:00:40.535488 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x6), length 1448
    00:00:00.000010 IP6 ::1 > ::1: frag (1448|64)

What happens in step (3) is:

  1) when connecting the socket in __ip6_datagram_connect(), we
     perform an XFRM lookup, miss the flow cache, create an XFRM
     bundle, and cache the destination,

  2) afterwards, when sending the datagram, we perform an XFRM lookup,
     again, miss the flow cache (due to mismatch of flowi6_iif and
     flowi6_oif, which is an issue of its own), and recreate an XFRM
     bundle based on the cached (and already transformed) destination.

To prevent the recreation of an XFRM bundle, avoid an XFRM lookup
altogether whenever we already have a destination entry cached in the
socket. This prevents the path MTU shrinkage and brings us on par with
UDPv4.

The fix also benefits connected PINGv6 sockets, another user of
ip6_sk_dst_lookup_flow(), who also suffer messages being transformed
twice.

Joint work with Hannes Frederic Sowa.

Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:16:06 -07:00
Guillaume Nault
a5c5e2da85 l2tp: fix configuration passed to setup_udp_tunnel_sock()
Unused fields of udp_cfg must be all zeros. Otherwise
setup_udp_tunnel_sock() fills ->gro_receive and ->gro_complete
callbacks with garbage, eventually resulting in panic when used by
udp_gro_receive().

[   72.694123] BUG: unable to handle kernel paging request at ffff880033f87d78
[   72.695518] IP: [<ffff880033f87d78>] 0xffff880033f87d78
[   72.696530] PGD 26e2067 PUD 26e3067 PMD 342ed063 PTE 8000000033f87163
[   72.696530] Oops: 0011 [#1] SMP KASAN
[   72.696530] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pptp gre pppox ppp_generic slhc crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel evdev aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw acpi_cpufreq button proc\
essor ext4 crc16 jbd2 mbcache virtio_blk virtio_net virtio_pci virtio_ring virtio
[   72.696530] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.7.0-rc1 #1
[   72.696530] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[   72.696530] task: ffff880035b59700 ti: ffff880035b70000 task.ti: ffff880035b70000
[   72.696530] RIP: 0010:[<ffff880033f87d78>]  [<ffff880033f87d78>] 0xffff880033f87d78
[   72.696530] RSP: 0018:ffff880035f87bc0  EFLAGS: 00010246
[   72.696530] RAX: ffffed000698f996 RBX: ffff88003326b840 RCX: ffffffff814cc823
[   72.696530] RDX: ffff88003326b840 RSI: ffff880033e48038 RDI: ffff880034c7c780
[   72.696530] RBP: ffff880035f87c18 R08: 000000000000a506 R09: 0000000000000000
[   72.696530] R10: ffff880035f87b38 R11: ffff880034b9344d R12: 00000000ebfea715
[   72.696530] R13: 0000000000000000 R14: ffff880034c7c780 R15: 0000000000000000
[   72.696530] FS:  0000000000000000(0000) GS:ffff880035f80000(0000) knlGS:0000000000000000
[   72.696530] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   72.696530] CR2: ffff880033f87d78 CR3: 0000000033c98000 CR4: 00000000000406a0
[   72.696530] Stack:
[   72.696530]  ffffffff814cc834 ffff880034b93468 0000001481416818 ffff88003326b874
[   72.696530]  ffff880034c7ccb0 ffff880033e48038 ffff88003326b840 ffff880034b93462
[   72.696530]  ffff88003326b88a ffff88003326b88c ffff880034b93468 ffff880035f87c70
[   72.696530] Call Trace:
[   72.696530]  <IRQ>
[   72.696530]  [<ffffffff814cc834>] ? udp_gro_receive+0x1c6/0x1f9
[   72.696530]  [<ffffffff814ccb1c>] udp4_gro_receive+0x2b5/0x310
[   72.696530]  [<ffffffff814d989b>] inet_gro_receive+0x4a3/0x4cd
[   72.696530]  [<ffffffff81431b32>] dev_gro_receive+0x584/0x7a3
[   72.696530]  [<ffffffff810adf7a>] ? __lock_is_held+0x29/0x64
[   72.696530]  [<ffffffff814321f7>] napi_gro_receive+0x124/0x21d
[   72.696530]  [<ffffffffa000b145>] virtnet_receive+0x8df/0x8f6 [virtio_net]
[   72.696530]  [<ffffffffa000b27e>] virtnet_poll+0x1d/0x8d [virtio_net]
[   72.696530]  [<ffffffff81431350>] net_rx_action+0x15b/0x3b9
[   72.696530]  [<ffffffff815893d6>] __do_softirq+0x216/0x546
[   72.696530]  [<ffffffff81062392>] irq_exit+0x49/0xb6
[   72.696530]  [<ffffffff81588e9a>] do_IRQ+0xe2/0xfa
[   72.696530]  [<ffffffff81587a49>] common_interrupt+0x89/0x89
[   72.696530]  <EOI>
[   72.696530]  [<ffffffff810b05df>] ? trace_hardirqs_on_caller+0x229/0x270
[   72.696530]  [<ffffffff8102b3c7>] ? default_idle+0x1c/0x2d
[   72.696530]  [<ffffffff8102b3c5>] ? default_idle+0x1a/0x2d
[   72.696530]  [<ffffffff8102bb8c>] arch_cpu_idle+0xa/0xc
[   72.696530]  [<ffffffff810a6c39>] default_idle_call+0x1a/0x1c
[   72.696530]  [<ffffffff810a6d96>] cpu_startup_entry+0x15b/0x20f
[   72.696530]  [<ffffffff81039a81>] start_secondary+0x12c/0x133
[   72.696530] Code: ff ff ff ff ff ff ff ff ff ff 7f ff ff ff ff ff ff ff 7f 00 7e f8 33 00 88 ff ff 6d 61 58 81 ff ff ff ff 5e de 0a 81 ff ff ff ff <00> 5c e2 34 00 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00
[   72.696530] RIP  [<ffff880033f87d78>] 0xffff880033f87d78
[   72.696530]  RSP <ffff880035f87bc0>
[   72.696530] CR2: ffff880033f87d78
[   72.696530] ---[ end trace ad7758b9a1dccf99 ]---
[   72.696530] Kernel panic - not syncing: Fatal exception in interrupt
[   72.696530] Kernel Offset: disabled
[   72.696530] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

v2: use empty initialiser instead of "{ NULL }" to avoid relying on
    first field's type.

Fixes: 38fd2af24f ("udp: Add socket based GRO and config")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:11:53 -07:00
Hariprasad Shenai
c0530dd3ef cxgb4: Add device id of T540-BT adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 10:23:46 -07:00
Ben Dooks
88832a22d6 net-sysfs: fix missing <linux/of_net.h>
The of_find_net_device_by_node() function is defined in
<linux/of_net.h> but not included in the .c file that
implements it. Fix the following warning by including the
header:

net/core/net-sysfs.c:1494:19: warning: symbol 'of_find_net_device_by_node' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 00:37:58 -07:00
Toshiaki Makita
0b148def40 bridge: Don't insert unnecessary local fdb entry on changing mac address
The missing br_vlan_should_use() test caused creation of an unneeded
local fdb entry on changing mac address of a bridge device when there is
a vlan which is configured on a bridge port but not on the bridge
device.

Fixes: 2594e9064a ("bridge: vlan: add per-vlan struct and move to rhashtables")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 00:31:38 -07:00
David S. Miller
3256564458 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains two Netfilter/IPVS fixes for your net
tree, they are:

1) Fix missing alignment in next offset calculation for standard
   targets, introduced in the previous merge window, patch from
   Florian Westphal.

2) Fix to correct the handling of outgoing connections which use the
   SIP-pe such that the binding of a real-server is updated when needed.
   This was an omission from changes introduced by Marco Angaroni in
   the previous merge window too, to allow handling of outgoing
   connections by the SIP-pe. Patch and report came via Simon Horman.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 17:14:10 -07:00
Yuchung Cheng
ce3cf4ec03 tcp: record TLP and ER timer stats in v6 stats
The v6 tcp stats scan do not provide TLP and ER timer information
correctly like the v4 version . This patch fixes that.

Fixes: 6ba8a3b19e ("tcp: Tail loss probe (TLP)")
Fixes: eed530b6c6 ("tcp: early retransmit")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 17:12:22 -07:00
Daniel Borkmann
92c075dbde net: sched: fix tc_should_offload for specific clsact classes
When offloading classifiers such as u32 or flower to hardware, and the
qdisc is clsact (TC_H_CLSACT), then we need to differentiate its classes,
since not all of them handle ingress, therefore we must leave those in
software path. Add a .tcf_cl_offload() callback, so we can generically
handle them, tested on ixgbe.

Fixes: 10cbc68434 ("net/sched: cls_flower: Hardware offloaded filters statistics support")
Fixes: 5b33f48842 ("net/flower: Introduce hardware offload support")
Fixes: a1b7c5fd7f ("net: sched: add cls_u32 offload hooks for netdevs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:59:53 -07:00
WANG Cong
a03e6fe569 act_police: fix a crash during removal
The police action is using its own code to initialize tcf hash
info, which makes us to forgot to initialize a->hinfo correctly.
Fix this by calling the helper function tcf_hash_create() directly.

This patch fixed the following crash:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
 IP: [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
 PGD d3c34067 PUD d3e18067 PMD 0
 Oops: 0000 [#1] SMP
 CPU: 2 PID: 853 Comm: tc Not tainted 4.6.0+ #87
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 task: ffff8800d3e28040 ti: ffff8800d3f6c000 task.ti: ffff8800d3f6c000
 RIP: 0010:[<ffffffff810c099f>]  [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
 RSP: 0000:ffff88011b203c80  EFLAGS: 00010002
 RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000028
 RBP: ffff88011b203d40 R08: 0000000000000001 R09: 0000000000000000
 R10: ffff88011b203d58 R11: ffff88011b208000 R12: 0000000000000001
 R13: ffff8800d3e28040 R14: 0000000000000028 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000028 CR3: 00000000d4be1000 CR4: 00000000000006e0
 Stack:
  ffff8800d3e289c0 0000000000000046 000000001b203d60 ffffffff00000000
  0000000000000000 ffff880000000000 0000000000000000 ffffffff00000000
  ffffffff8187142c ffff88011b203ce8 ffff88011b203ce8 ffffffff8101dbfc
 Call Trace:
  <IRQ>
  [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
  [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
  [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
  [<ffffffff810a9604>] ? sched_clock_local+0x11/0x78
  [<ffffffff810bf6a1>] ? mark_lock+0x24/0x201
  [<ffffffff810c1dbd>] lock_acquire+0x120/0x1b4
  [<ffffffff810c1dbd>] ? lock_acquire+0x120/0x1b4
  [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
  [<ffffffff81aad89f>] _raw_spin_lock_bh+0x3c/0x72
  [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
  [<ffffffff8187142c>] __tcf_hash_release+0x77/0xd1
  [<ffffffff81871a27>] tcf_action_destroy+0x49/0x7c
  [<ffffffff81870b1c>] tcf_exts_destroy+0x20/0x2d
  [<ffffffff8189273b>] u32_destroy_key+0x1b/0x4d
  [<ffffffff81892788>] u32_delete_key_freepf_rcu+0x1b/0x1d
  [<ffffffff810de3b8>] rcu_process_callbacks+0x610/0x82e
  [<ffffffff8189276d>] ? u32_destroy_key+0x4d/0x4d
  [<ffffffff81ab0bc1>] __do_softirq+0x191/0x3f4

Fixes: ddf97ccdd7 ("net_sched: add network namespace support for tc actions")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:38:59 -07:00
Eric Dumazet
aafddbf0cf fq_codel: return non zero qlen in class dumps
We properly scan the flow list to count number of packets,
but John passed 0 to gnet_stats_copy_queue() so we report
a zero value to user space instead of the result.

Fixes: 6401585366 ("net: sched: restrict use of qstats qlen")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:28:11 -07:00
David S. Miller
064d5e6f8e Merge branch 'u32-hwoffload-fixes'
Jakub Kicinski says:

====================
cls_u32 hardware offload fixes

This set fixes two small issues with error codes I noticed
in cls_u32.  Second patch could be viewed as user space API
change but that portion of API is not part of any release,
yet.

Compile tested only.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:27:15 -07:00
Jakub Kicinski
d47a0f387f net: cls_u32: be more strict about skip-sw flag
Return an error if user requested skip-sw and the underlaying
hardware cannot handle tc offloads (or offloads are disabled).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:27:14 -07:00
Jakub Kicinski
1a0f7d2984 net: cls_u32: fix error code for invalid flags
'err' variable is not set in this test, we would return whatever
previous test set 'err' to.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:27:14 -07:00
Colin Ian King
7b01b8e847 gtp: #define _UAPI_LINUX_GTP_H_ and not _UAPI_LINUX_GTP_H__
Fix clang build warning:

./include/uapi/linux/gtp.h:1:9: warning: '_UAPI_LINUX_GTP_H_' is
used as a header guard here, followed by #define of a different
macro [-Wheader-guard]

fix by defining  _UAPI_LINUX_GTP_H_ and not _UAPI_LINUX_GTP_H__

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:25:49 -07:00
Colin Ian King
9f647a6de9 net: fec: fix spelling mistakes and add missing newline
trivial fix to spelling mistakes and add missing newline in pr_err
messages

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:15:59 -07:00
David S. Miller
71743ffa15 Merge branch 'bnxt_en-fixes'
Michael Chan says:

====================
bnxt_en: Bug fixes.

Fix a race condition and VLAN rx acceleration logic.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:02:04 -07:00
Michael Chan
8852ddb4dc bnxt_en: Simplify VLAN receive logic.
Since both CTAG and STAG rx acceleration must be enabled together, we
only need to check one feature flag (NETIF_F_HW_VLAN_CTAG_RX) before
calling __vlan_hwaccel_put_tag().

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:02:03 -07:00
Michael Chan
5a9f6b238e bnxt_en: Enable and disable RX CTAG and RX STAG VLAN acceleration together.
The hardware can only be set to strip or not strip both the VLAN CTAG and
STAG.  It cannot strip one and not strip the other.  Add logic to
bnxt_fix_features() to toggle both feature flags when the user is toggling
one of them.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:02:03 -07:00
Michael Chan
b9a8460a08 bnxt_en: Fix tx push race condition.
Set the is_push flag in the software BD before the tx data is pushed to
the chip.  It is possible to get the tx interrupt as soon as the tx data
is pushed.  The tx handler will not handle the event properly if the
is_push flag is not set and it will crash.

Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:02:03 -07:00
Wu Fengguang
fa54cc70ed rxrpc: fix ptr_ret.cocci warnings
net/rxrpc/rxkad.c:1165:1-3: WARNING: PTR_ERR_OR_ZERO can be used

 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

CC: David Howells <dhowells@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 15:30:21 -07:00
David S. Miller
29a36611e9 Merge branch 'rds-packet-assembly-fixes'
Sowmini Varadhan says:

====================
RDS: TCP: socket locking RDS packet assembly fixes

This three part patchset fixes bugs in synchronization between
rds_tcp_accept_one() and the rds-tcp send/recv path.

Patch 1 ensures that the lock_sock() is taken appropriately
and the RDS datagram reassembly state is reset  to synchronize
with the receive path.

Patch 2 ensures that partially sent RDS datagrams will get
retransmitted after rds_tcp_accept_one() switches sockets.

Patch 3 fixes a race window which would prematurely re-enable
rds_send_xmit() before the rds_tcp_connection setup has been
completed in rds_tcp_accept_one().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 15:10:16 -07:00
Sowmini Varadhan
9c79440e2c RDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one()
The send path needs to be quiesced before resetting callbacks from
rds_tcp_accept_one(), and commit eb19284026 ("RDS:TCP: Synchronize
rds_tcp_accept_one with rds_send_xmit when resetting t_sock") achieves
this using the c_state and RDS_IN_XMIT bit following the pattern
used by rds_conn_shutdown(). However this leaves the possibility
of a race window as shown in the sequence below
    take t_conn_lock in rds_tcp_conn_connect
    send outgoing syn to peer
    drop t_conn_lock in rds_tcp_conn_connect
    incoming from peer triggers rds_tcp_accept_one, conn is
	marked CONNECTING
    wait for RDS_IN_XMIT to quiesce any rds_send_xmit threads
    call rds_tcp_reset_callbacks
    [.. race-window where incoming syn-ack can cause the conn
	to be marked UP from rds_tcp_state_change ..]
    lock_sock called from rds_tcp_reset_callbacks, and we set
	t_sock to null
As soon as the conn is marked UP in the race-window above, rds_send_xmit()
threads will proceed to rds_tcp_xmit and may encounter a null-pointer
deref on the t_sock.

Given that rds_tcp_state_change() is invoked in softirq context, whereas
rds_tcp_reset_callbacks() is in workq context, and testing for RDS_IN_XMIT
after lock_sock could result in a deadlock with tcp_sendmsg, this
commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which
will prevent a transition to RDS_CONN_UP from rds_tcp_state_change().

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 15:10:15 -07:00
Sowmini Varadhan
0b6f760cff RDS: TCP: Retransmit half-sent datagrams when switching sockets in rds_tcp_reset_callbacks
When we switch a connection's sockets in rds_tcp_rest_callbacks,
any partially sent datagram must be retransmitted on the new
socket so that the receiver can correctly reassmble the RDS
datagram. Use rds_send_reset() which is designed for this purpose.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 15:10:15 -07:00
Sowmini Varadhan
335b48d980 RDS: TCP: Add/use rds_tcp_reset_callbacks to reset tcp socket safely
When rds_tcp_accept_one() has to replace the existing tcp socket
with a newer tcp socket (duelling-syn resolution), it must lock_sock()
to suppress the rds_tcp_data_recv() path while callbacks are being
changed.  Also, existing RDS datagram reassembly state must be reset,
so that the next datagram on the new socket  does not have corrupted
state. Similarly when resetting the newly accepted socket, appropriate
locks and synchronization is needed.

This commit ensures correct synchronization by invoking
kernel_sock_shutdown to reset a newly accepted sock, and by taking
appropriate lock_sock()s (for old and new sockets) when resetting
existing callbacks.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 15:10:15 -07:00
Eric Dumazet
80e509db54 fq_codel: fix NET_XMIT_CN behavior
My prior attempt to fix the backlogs of parents failed.

If we return NET_XMIT_CN, our parents wont increase their backlog,
so our qdisc_tree_reduce_backlog() should take this into account.

v2: Florian Westphal pointed out that we could drop the packet,
so we need to save qdisc_pkt_len(skb) in a temp variable before
calling fq_codel_drop()

Fixes: 9d18562a22 ("fq_codel: add batch ability to fq_codel_drop()")
Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 14:49:56 -07:00
Daniel Borkmann
5b6c1b4d46 bpf, trace: use READ_ONCE for retrieving file ptr
In bpf_perf_event_read() and bpf_perf_event_output(), we must use
READ_ONCE() for fetching the struct file pointer, which could get
updated concurrently, so we must prevent the compiler from potential
refetching.

We already do this with tail calls for fetching the related bpf_prog,
but not so on stored perf events. Semantics for both are the same
with regards to updates.

Fixes: a43eec3042 ("bpf: introduce bpf_perf_event_output() helper")
Fixes: 35578d7984 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 14:48:03 -07:00
WANG Cong
a27758ffaf net_sched: keep backlog updated with qlen
For gso_skb we only update qlen, backlog should be updated too.

Note, it is correct to just update these stats at one layer,
because the gso_skb is cached there.

Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-06 21:14:29 -04:00
Helge Deller
1957598840 soreuseport: add compat case for setsockopt SO_ATTACH_REUSEPORT_CBPF
Commit 538950a1b7 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
missed to add the compat case for the SO_ATTACH_REUSEPORT_CBPF option.

Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-06 15:21:04 -07:00
Helge Deller
fc100a7f89 soreuseport: Fix reuseport_bpf testcase on 32bit architectures
This fixes the following compiler warnings when compiling the
reuseport_bpf testcase on a 32 bit platform:

reuseport_bpf.c: In function ‘attach_ebpf’:
reuseport_bpf.c:114:15: warning: cast from pointer to integer of ifferent size [-Wpointer-to-int-cast]

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-06 15:20:16 -07:00
Michal Schmidt
a02cc9d3cc bnx2x: allow adding VLANs while interface is down
Since implementing VLAN filtering in commit 05cc5a39dd
("bnx2x: add vlan filtering offload") bnx2x refuses to add a VLAN while
the interface is down:

  # ip link add link enp3s0f0 enp3s0f0_10 type vlan id 10
  RTNETLINK answers: Bad address

and in dmesg (with bnx2x.debug=0x20):
  bnx2x: [bnx2x_vlan_rx_add_vid:12941(enp3s0f0)]Ignoring VLAN
  configuration the interface is down

Other drivers have no problem with this.
Fix this peculiar behavior in the following way:
 - Accept requests to add/kill VID regardless of the device state.
   Maintain the requested list of VIDs in the bp->vlan_reg list.
 - If the device is up, try to configure the VID list into the hardware.
   If we run out of VLAN credits or encounter a failure configuring an
   entry, fall back to accepting all VLANs.
   If we successfully configure all entries from the list, turn the
   fallback off.
 - Use the same code for reconfiguring VLANs during NIC load.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-05 23:11:39 -04:00
Marco Angaroni
3ec10d3a2b ipvs: update real-server binding of outgoing connections in SIP-pe
Previous patch that introduced handling of outgoing packets in SIP
persistent-engine did not call ip_vs_check_template() in case packet was
matching a connection template. Assumption was that real-server was
healthy, since it was sending a packet just in that moment.

There are however real-server fault conditions requiring that association
between call-id and real-server (represented by connection template)
gets updated. Here is an example of the sequence of events:
  1) RS1 is a back2back user agent that handled call-id1 and call-id2
  2) RS1 is down and was marked as unavailable
  3) new message from outside comes to IPVS with call-id1
  4) IPVS reschedules the message to RS2, which becomes new call handler
  5) RS2 forwards the message outside, translating call-id1 to call-id2
  6) inside pe->conn_out() IPVS matches call-id2 with existing template
  7) IPVS does not change association call-id2 <-> RS1
  8) new message comes from client with call-id2
  9) IPVS reschedules the message to a real-server potentially different
     from RS2, which is now the correct destination

This patch introduces ip_vs_check_template() call in the handling of
outgoing packets for SIP-pe. And also introduces a second optional
argument for ip_vs_check_template() that allows to check if dest
associated to a connection template is the same dest that was identified
as the source of the packet. This is to change the real-server bound to a
particular call-id independently from its availability status: the idea
is that it's more reliable, for in->out direction (where internal
network can be considered trusted), to always associate a call-id with
the last real-server that used it in one of its messages. Think about
above sequence of events where, just after step 5, RS1 returns instead
to be available.

Comparison of dests is done by simply comparing pointers to struct
ip_vs_dest; there should be no cases where struct ip_vs_dest keeps its
memory address, but represent a different real-server in terms of
ip-address / port.

Fixes: 39b9722315 ("ipvs: handle connections started by real-servers")
Signed-off-by: Marco Angaroni <marcoangaroni@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2016-06-06 09:47:25 +09:00
David S. Miller
4ef36e1566 wireless-drivers fixes for 4.7
brcmfmac
 
 * add fallback RSSI report for devices that do not report per-chain values
 * fix a null pointer derefence regression on PCIe full dongle devices
 
 rtlwifi
 
 * fix scheduling while atomic regression from commit 49f86ec21c
 
 MAINTAINERS
 
 * add file patterns for wireless device tree bindings
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJXUvUsAAoJEG4XJFUm622bdKAH/R2UKYx3t73llbqlpgkSCOxZ
 wRMleoRZRF8DI9lT4QUxGCq5wxED0U2TtDJh7LJJ9I5sAY1n50w+e2TTZ5r6ftXo
 1J79NvvrbVM8227shburpveyxzeQkLtI+DDkP07nMtF3VNxOrU9+z3TPM27QR1LM
 0pq/yrKZ7Qnrf9gf4oTeH0dQKYmA4Om/HXjnMU4Sxi/vhKQBfcemjHacv37BTG/F
 4PrFzC17O7wVFDeKKXXieK1Z9inpocZMG9OceHXxi9fwrT+RhMnJdrDarSi4b92W
 m1yiRcS0389R/jEG3O2LRVNIJVnBHdQmEU8Bg8nup377sFxX/weWVYagaX4f2bE=
 =AObt
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2016-06-04' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.7

brcmfmac

* add fallback RSSI report for devices that do not report per-chain values
* fix a null pointer derefence regression on PCIe full dongle devices

rtlwifi

* fix scheduling while atomic regression from commit 49f86ec21c

MAINTAINERS

* add file patterns for wireless device tree bindings
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-04 22:14:53 -04:00
Geert Uytterhoeven
182fd9eecb MAINTAINERS: Add file patterns for wireless device tree bindings
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-06-04 17:24:02 +03:00
David S. Miller
e7eacc9e0b Merge branch 'mediatek-fixes'
John Crispin says:

====================
net-next: mediatek: improve phy support

The current driver did not handle the RGMII delay modes and asymmetric flow
control properly. The mii_bus is not freed properly. Also add support for
fixed-phy allowing the driver to work on SoCs that have an internal gigabit
switch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:54:23 -04:00
John Crispin
37920fce0f net-next: mediatek: properly handle RGMII modes
If an external Gigabit PHY is connected to either of the MACs we need to
be able to tell the PHY to use a delay. Not doing so will result in heavy
packet loss and/or data corruption when using PHYs such as the IC+ IP1001.
We tell the PHY which MII delay mode to use via the devictree.

The ethernet driver needs to be adapted to handle all 3 rgmii-*id modes
in the same way as normal rgmii when setting up the MAC.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:54:16 -04:00
John Crispin
0c72c50f6f net-next: mediatek: add fixed-phy support
The MT7623 SoC has a builtin gigabit switch. If we want to use it, GMAC1
needs to be configured using a fixed link speed and flow control settings.
The easiest way to do this is to used the fixed-phy driver, allowing us to
reuse the existing mdio polling code to setup the MAC.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:54:16 -04:00
John Crispin
08ef55c6f2 net-next: mediatek: fix gigabit and flow control advertisement
The current code will not setup the PHYs advertisement features correctly.
Fix this and properly advertise Gigabit features and properly handle
asymmetric pause frames.

Signed-off-by: Sean Wang <keyhaede@gmail.com>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:54:16 -04:00
John Crispin
207bdf1844 net-next: mediatek: use mdiobus_free() in favour of kfree()
The driver currently uses kfree() to clear the mii_bus. This is not the
correct way to clear the memory and mdiobus_free() should be used instead.
This patch fixes the two instances where this happens in the driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:54:16 -04:00
Ivan Khoronzhuk
8478b6cdc1 net: ethernet: ti: cpsw: fix rx-usecs interrupt pacing consistency
The rx-usecs shouldn't be changed while interface down/up.
Currently, for instance, if it's set to 100us, after interface
down/up it's 500us. It's a hidden bug that can lead to lavish
interrupt pacing time increasing while "down/up" up to max value.

Steps to reproduce:
- set rx-usecs to be 100us
- down/up interface
- read new unexpected rx-usecs

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:35:06 -04:00
Yangbo Lu
9c8b0778e4 gianfar: fix the last transmit buffer descriptor
When the transmit hardware timestamping is enabled, an additional
TxBD would be added and would be set as the last TxBD with TXBD_LAST
and TXBD_INTERRUPT. However this has been broken by a patch recently.
This made the software couldn't get transmit hardware timestamps and
resulted in call trace. So, this patch is to fix this issue.

Fixes: 48963b4492 ("gianfar: Remove redundant ops for do_tstamp
       from xmit()")
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:32:23 -04:00
WANG Cong
8d5958f424 sch_tbf: update backlog as well
Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:24:04 -04:00
WANG Cong
d7f4f332f0 sch_red: update backlog as well
Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:24:04 -04:00
WANG Cong
6a73b571b6 sch_drr: update backlog as well
Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:24:04 -04:00
WANG Cong
6529d75ad9 sch_prio: update backlog as well
We need to update backlog too when we update qlen.

Joint work with Stas.

Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Tested-by: Stas Nichiporovich <stasn77@gmail.com>
Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-03 19:24:04 -04:00