Commit graph

602333 commits

Author SHA1 Message Date
Sudarsana Reddy Kalluru
34c7bb4705 qed: Protect the doorbell BAR with the write barriers.
SPQ doorbell is currently protected with the compilation barrier. Under the
stress scenarios, we may get into a state where (due to the weak ordering)
several ramrod doorbells were written to the BAR with an out-of-order
producer values. Need to change the barrier type to a write barrier to make
sure that the write buffer is flushed after each doorbell.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 08:12:45 -04:00
David Barroso
b560f03ddf neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit()
neigh_xmit() expects to be called inside an RCU-bh read side critical
section, and while one of its two current callers gets this right, the
other one doesn't.

More specifically, neigh_xmit() has two callers, mpls_forward() and
mpls_output(), and while both callers call neigh_xmit() under
rcu_read_lock(), this provides sufficient protection for neigh_xmit()
only in the case of mpls_forward(), as that is always called from
softirq context and therefore doesn't need explicit BH protection,
while mpls_output() can be called from process context with softirqs
enabled.

When mpls_output() is called from process context, with softirqs
enabled, we can be preempted by a softirq at any time, and RCU-bh
considers the completion of a softirq as signaling the end of any
pending read-side critical sections, so if we do get a softirq
while we are in the part of neigh_xmit() that expects to be run inside
an RCU-bh read side critical section, we can end up with an unexpected
RCU grace period running right in the middle of that critical section,
making things go boom.

This patch fixes this impedance mismatch in the callee, by making
neigh_xmit() always take rcu_read_{,un}lock_bh() around the code that
expects to be treated as an RCU-bh read side critical section, as this
seems a safer option than fixing it in the callers.

Fixes: 4fd3d7d9e8 ("neigh: Add helper function neigh_xmit")
Signed-off-by: David Barroso <dbarroso@fastly.com>
Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 07:58:28 -04:00
Jarod Wilson
889ad45666 e1000e: keep VLAN interfaces functional after rxvlan off
I've got a bug report about an e1000e interface, where a VLAN interface is
set up on top of it:

$ ip link add link ens1f0 name ens1f0.99 type vlan id 99
$ ip link set ens1f0 up
$ ip link set ens1f0.99 up
$ ip addr add 192.168.99.92 dev ens1f0.99

At this point, I can ping another host on vlan 99, ip 192.168.99.91.
However, if I do the following:

$ ethtool -K ens1f0 rxvlan off

Then no traffic passes on ens1f0.99. It comes back if I toggle rxvlan on
again. I'm not sure if this is actually intended behavior, or if there's a
lack of software VLAN stripping fallback, or what, but things continue to
work if I simply don't call e1000e_vlan_strip_disable() if there are
active VLANs (plagiarizing a function from the e1000 driver here) on the
interface.

Also slipped a related-ish fix to the kerneldoc text for
e1000e_vlan_strip_disable here...

Signed-off-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 07:39:48 -04:00
Dan Carpenter
5b4d10f5e0 qlcnic: use the correct ring in qlcnic_83xx_process_rcv_ring_diag()
There is a static checker warning here "warn: mask and shift to zero"
and the code sets "ring" to zero every time.  From looking at how
QLCNIC_FETCH_RING_ID() is used in qlcnic_83xx_process_rcv_ring() the
qlcnic_83xx_hndl() should be removed.

Fixes: 4be41e92f7 ('qlcnic: 83xx data path routines')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:46:16 -04:00
Daniel Borkmann
ceb5607035 bpf, perf: delay release of BPF prog after grace period
Commit dead9f29dd ("perf: Fix race in BPF program unregister") moved
destruction of BPF program from free_event_rcu() callback to __free_event(),
which is problematic if used with tail calls: if prog A is attached as
trace event directly, but at the same time present in a tail call map used
by another trace event program elsewhere, then we need to delay destruction
via RCU grace period since it can still be in use by the program doing the
tail call (the prog first needs to be dropped from the tail call map, then
trace event with prog A attached destroyed, so we get immediate destruction).

Fixes: dead9f29dd ("perf: Fix race in BPF program unregister")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jann Horn <jann@thejh.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:42:55 -04:00
Nikolay Aleksandrov
565ce8f32a net: bridge: fix vlan stats continue counter
I made a dumb off-by-one mistake when I added the vlan stats counter
dumping code. The increment should happen before the check, not after
otherwise we miss one entry when we continue dumping.

Fixes: a60c090361 ("bridge: netlink: export per-vlan stats")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:33:35 -04:00
Eric Dumazet
a3d2e9f8eb tcp: do not send too big packets at retransmit time
Arjun reported a bug in TCP stack and bisected it to a recent commit.

In case where we process SACK, we can coalesce multiple skbs
into fat ones (tcp_shift_skb_data()), to lower write queue
overhead, because we do not expect to retransmit these packets.

However, SACK reneging can happen, forcing the sender to retransmit
all these packets. If skb->len is above 64KB, we then send buggy
IP packets that could hang TSO engine on cxgb4.

Neal suggested to use tcp_tso_autosize() instead of tp->gso_segs
so that we cook packets of optimal size vs TCP/pacing.

Thanks to Arjun for reporting the bug and running the tests !

Fixes: 10d3be5692 ("tcp-tso: do not split TSO packets at retransmit time")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Arjun V <arjun@chelsio.com>
Tested-by: Arjun V <arjun@chelsio.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:25:11 -04:00
Wei Yongjun
96183182ad ibmvnic: fix to use list_for_each_safe() when delete items
Since we will remove items off the list using list_del() we need
to use a safe version of the list_for_each() macro aptly named
list_for_each_safe().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:23:42 -04:00
David S. Miller
b2c1b30e3e Merge branch 'thunderx-fixes'
Sunil Goutham says:

====================
net: thunderx: Miscellaneous fixes

This 2 patch series fixes issues w.r.t physical link status
reporting and transmit datapath configuration for
secondary qsets.

Changes from v1:
Fixed lmac disable sequence for interfaces of type SGMII.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:14:19 -04:00
Sunil Goutham
3e29adba56 net: thunderx: Fix TL4 configuration for secondary Qsets
TL4 calculation for a given SQ of secondary Qsets is incorrect
and goes out of bounds and also for some SQ's TL4 chosen will
transmit data via a different BGX interface and not same as
primary Qset's interface.

This patch fixes this issue.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:14:13 -04:00
Sunil Goutham
3f4c68cfde net: thunderx: Fix link status reporting
Check for SMU RX local/remote faults along with SPU LINK
status. Otherwise at times link is UP at our end but DOWN
at link partner's side. Also due to an issue in BGX it's
rarely seen that initialization doesn't happen properly
and SMU RX reports faults with everything fine at SPU.
This patch tries to reinitialize LMAC to fix it.

Also fixed LMAC disable sequence to properly bring down link.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Tao Wang <tao.wang@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:14:13 -04:00
David S. Miller
f5074d0ce2 Merge branch 'mlx5-100G-fixes'
Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes#2 for 4.7-rc

The following series provides one-liners fixes for mlx5 driver plus one
medium patch to reorganize ethtool counters reporting.

Highlights:
	- Added MODIFY_FLOW_TABLE to command strings table
	- Add ConnectX-5 PCIe 4.0 to list of supported devices
	- Rename ASYNC_EVENTS enum
	- Enable BlueFlame only when supported by device
	- Avoid adding same vxlan port twice
	- Report the correct number of PFC counters
	- Reorganize ethtool reported counters and remove duplications
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:59 -04:00
Gal Pressman
bfe6d8d1d4 net/mlx5e: Reorganize ethtool statistics
Categorize and reorganize ethtool statistics counters by renaming to
"rx_*" and "tx_*" and removing redundant and duplicated counters, this
way they are easier to grasp and more user friendly.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:47 -04:00
Gal Pressman
ed80ec4c17 net/mlx5e: Fix number of PFC counters reported to ethtool
Number of PFC counters used to count only number of priorities with PFC
enabled, but each priority has more than one counter, hence the need to
multiply it by the number of PFC counters per priority.

Fixes: cf678570d5 ('net/mlx5e: Add per priority group to PPort counters')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Matthew Finlay
9ceec359e4 net/mlx5e: Prevent adding the same vxlan port
Do not allow the same vxlan udp port to be added to the device more than
once.

Fixes: b3f63c3d5e ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Gal Pressman
fd4782c213 net/mlx5e: Check for BlueFlame capability before allocating SQ uar
Previous to this patch mapping was always set to write combining without
checking whether BlueFlame is supported in the device.

Fixes: 0ba422410b ('net/mlx5: Fix global UAR mapping')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Eli Cohen
e0f46eb9f6 net/mlx5e: Change enum to better reflect usage
Change MLX5E_STATE_ASYNC_EVENTS_ENABLE to
MLX5E_STATE_ASYNC_EVENTS_ENABLED since it represent a state and not an
operation.

Fixes: acff797cd1 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Majd Dibbiny
7092fe8669 net/mlx5: Add ConnectX-5 PCIe 4.0 to list of supported devices
Add the upcoming ConnectX-5 PCIe 4.0 device to the list of
supported devices by the mlx5 driver.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Eli Cohen
5be1ea899d net/mlx5: Update command strings
Add command string for MODIFY_FLOW_TABLE which is used by the driver.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Harini Katakam
3ec0a0f10c net: marvell: Add separate config ANEG function for Marvell 88E1111
Marvell 88E1111 currently uses the generic marvell config ANEG function.
This function has a sequence accessing Page 5 and Register 31,
both of which are not defined or reserved for this PHY.
Hence this patch adds a new config ANEG function for Marvell 88E1111
without these erroneous accesses.

Signed-off-by: Harini Katakam <harinik@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:07:36 -04:00
David S. Miller
d10f0b312a Merge branch 'batman-adv-fixes'
Sven Eckelmann says:

====================
batman-adv: Fixes for Linux 4.7

Antonio currently seems to be occupied. This is currently rather unfortunate
because there are patches waiting in the batman-adv development repository
maint(enance) branch [1] since up to 6 weeks. I am now getting asked when
these patches will hit the distribution kernels and therefore decided to
submit these patches directly to netdev.

The patch from Simon works around the problem that warnings could be triggered
in the translation table code via packets using a VLAN not configured on the
target host. This warning was replaced with a rate limited info message.

Ben Hutchings found an superfluous batadv_softif_vlan_put in the error
handling code of the translation table while he backported the "batman-adv:
Fix reference counting of vlan object for tt_local_entry" patch to the stable
kernels. He noticed correctly that this batadv_softif_vlan_put should also
have been removed by the said patch.

The most requested fix at the moment is related to a double free in the
translation table code. It is a race condition which mostly happens on systems
with multiple cores and multiple network interface attached to batman-adv. Two
Freifunk communities which were haunted by weird crashes (with backtraces
reporting problems in other parts of the kernel) were kind enough to test this
patch. They reported that there systems are now running stable after applying
this patch.

An invalid memory access was detected in the batadv_icmp_packet_rr handling
code when receiving a skbuff with fragments. The last patch is fixing a memory
leak when the interface is removed via .dellink. The code to fix it was copied
from the code handling the legacy sysfs interface to remove netdevices from a
batman-adv netdevice.

There are still 28 patches in the development tree for v4.8 but I will leave
them to Antonio because these are cleanups and features and therefore for net-
next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:53 -04:00
Sven Eckelmann
420cb1b764 batman-adv: Clean up untagged vlan when destroying via rtnl-link
The untagged vlan object is only destroyed when the interface is removed
via the legacy sysfs interface. But it also has to be destroyed when the
standard rtnl-link interface is used.

Fixes: 5d2c05b213 ("batman-adv: add per VLAN interface attribute framework")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:48 -04:00
Sven Eckelmann
3b55e44220 batman-adv: Fix ICMP RR ethernet access after skb_linearize
The skb_linearize may reallocate the skb. This makes the calculated pointer
for ethhdr invalid. But it the pointer is used later to fill in the RR
field of the batadv_icmp_packet_rr packet.

Instead re-evaluate eth_hdr after the skb_linearize+skb_cow to fix the
pointer and avoid the invalid read.

Fixes: da6b8c20a5 ("batman-adv: generalize batman-adv icmp packet handling")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:48 -04:00
Ben Hutchings
baceced932 batman-adv: Fix double-put of vlan object
Each batadv_tt_local_entry hold a single reference to a
batadv_softif_vlan.  In case a new entry cannot be added to the hash
table, the error path puts the reference, but the reference will also
now be dropped by batadv_tt_local_entry_release().

Fixes: a33d970d0b ("batman-adv: Fix reference counting of vlan object for tt_local_entry")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:47 -04:00
Sven Eckelmann
9c4604a298 batman-adv: Fix use-after-free/double-free of tt_req_node
The tt_req_node is added and removed from a list inside a spinlock. But the
locking is sometimes removed even when the object is still referenced and
will be used later via this reference. For example batadv_send_tt_request
can create a new tt_req_node (including add to a list) and later
re-acquires the lock to remove it from the list and to free it. But at this
time another context could have already removed this tt_req_node from the
list and freed it.

CPU#0

    batadv_batman_skb_recv from net_device 0
    -> batadv_iv_ogm_receive
      -> batadv_iv_ogm_process
        -> batadv_iv_ogm_process_per_outif
          -> batadv_tvlv_ogm_receive
            -> batadv_tvlv_ogm_receive
              -> batadv_tvlv_containers_process
                -> batadv_tvlv_call_handler
                  -> batadv_tt_tvlv_ogm_handler_v1
                    -> batadv_tt_update_orig
                      -> batadv_send_tt_request
                        -> batadv_tt_req_node_new
                           spin_lock(...)
                           allocates new tt_req_node and adds it to list
                           spin_unlock(...)
                           return tt_req_node

CPU#1

    batadv_batman_skb_recv from net_device 1
    -> batadv_recv_unicast_tvlv
      -> batadv_tvlv_containers_process
        -> batadv_tvlv_call_handler
          -> batadv_tt_tvlv_unicast_handler_v1
            -> batadv_handle_tt_response
               spin_lock(...)
               tt_req_node gets removed from list and is freed
               spin_unlock(...)

CPU#0

                      <- returned to batadv_send_tt_request
                         spin_lock(...)
                         tt_req_node gets removed from list and is freed
                         MEMORY CORRUPTION/SEGFAULT/...
                         spin_unlock(...)

This can only be solved via reference counting to allow multiple contexts
to handle the list manipulation while making sure that only the last
context holding a reference will free the object.

Fixes: a73105b8d4 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Tested-by: Martin Weinelt <martin@darmstadt.freifunk.net>
Tested-by: Amadeus Alfa <amadeus@chemnitz.freifunk.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:47 -04:00
Simon Wunderlich
0b3dd7dfb8 batman-adv: replace WARN with rate limited output on non-existing VLAN
If a VLAN tagged frame is received and the corresponding VLAN is not
configured on the soft interface, it will splat a WARN on every packet
received. This is a quite annoying behaviour for some scenarios, e.g. if
bat0 is bridged with eth0, and there are arbitrary VLAN tagged frames
from Ethernet coming in without having any VLAN configuration on bat0.

The code should probably create vlan objects on the fly and
transparently transport these VLAN-tagged Ethernet frames, but until
this is done, at least the WARN splat should be replaced by a rate
limited output.

Fixes: 354136bcc3 ("batman-adv: fix kernel crash due to missing NULL checks")
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:47 -04:00
Florian Fainelli
69fc58a57e net: phy: Manage fixed PHY address space using IDA
If we have a system which uses fixed PHY devices and calls
fixed_phy_register() then fixed_phy_unregister() we can exhaust the
number of fixed PHYs available after a while, since we keep incrementing
the variable phy_fixed_addr, but we never decrement it.

This patch fixes that by converting the fixed PHY allocation to using
IDA, which takes care of the allocation/dealloaction of the PHY
addresses for us.

Fixes: a759512174 ("net: phy: extend fixed driver with fixed_phy_register()")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 03:51:40 -04:00
Willem de Bruijn
9a0fee2b55 sock_diag: do not broadcast raw socket destruction
Diag intends to broadcast tcp_sk and udp_sk socket destruction.
Testing sk->sk_protocol for IPPROTO_TCP/IPPROTO_UDP alone is not
sufficient for this. Raw sockets can have the same type.

Add a test for sk->sk_type.

Fixes: eb4cb00852 ("sock_diag: define destruction multicast groups")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 09:08:51 -04:00
Aaron Campbell
ab8ed95108 connector: fix out-of-order cn_proc netlink message delivery
The proc connector messages include a sequence number, allowing userspace
programs to detect lost messages.  However, performing this detection is
currently more difficult than necessary, since netlink messages can be
delivered to the application out-of-order.  To fix this, leave pre-emption
disabled during cn_netlink_send(), and use GFP_NOWAIT.

The following was written as a test case.  Building the kernel w/ make -j32
proved a reliable way to generate out-of-order cn_proc messages.

int
main(int argc, char *argv[])
{
	static uint32_t last_seq[CPU_SETSIZE], seq;
	int cpu, fd;
	struct sockaddr_nl sa;
	struct __attribute__((aligned(NLMSG_ALIGNTO))) {
		struct nlmsghdr nl_hdr;
		struct __attribute__((__packed__)) {
			struct cn_msg cn_msg;
			struct proc_event cn_proc;
		};
	} rmsg;
	struct __attribute__((aligned(NLMSG_ALIGNTO))) {
		struct nlmsghdr nl_hdr;
		struct __attribute__((__packed__)) {
			struct cn_msg cn_msg;
			enum proc_cn_mcast_op cn_mcast;
		};
	} smsg;

	fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
	if (fd < 0) {
		perror("socket");
	}

	sa.nl_family = AF_NETLINK;
	sa.nl_groups = CN_IDX_PROC;
	sa.nl_pid = getpid();
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		perror("bind");
	}

	memset(&smsg, 0, sizeof(smsg));
	smsg.nl_hdr.nlmsg_len = sizeof(smsg);
	smsg.nl_hdr.nlmsg_pid = getpid();
	smsg.nl_hdr.nlmsg_type = NLMSG_DONE;
	smsg.cn_msg.id.idx = CN_IDX_PROC;
	smsg.cn_msg.id.val = CN_VAL_PROC;
	smsg.cn_msg.len = sizeof(enum proc_cn_mcast_op);
	smsg.cn_mcast = PROC_CN_MCAST_LISTEN;
	if (send(fd, &smsg, sizeof(smsg), 0) != sizeof(smsg)) {
		perror("send");
	}

	while (recv(fd, &rmsg, sizeof(rmsg), 0) == sizeof(rmsg)) {
		cpu = rmsg.cn_proc.cpu;
		if (cpu < 0) {
			continue;
		}
		seq = rmsg.cn_msg.seq;
		if ((last_seq[cpu] != 0) && (seq != last_seq[cpu] + 1)) {
			printf("out-of-order seq=%d on cpu=%d\n", seq, cpu);
		}
		last_seq[cpu] = seq;
	}

	/* NOTREACHED */

	perror("recv");

	return -1;
}

Signed-off-by: Aaron Campbell <aaron@monkey.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 08:48:33 -04:00
daniel
0888d5f3c0 Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address
The bridge is falsly dropping ipv6 mulitcast packets if there is:
 1. No ipv6 address assigned on the brigde.
 2. No external mld querier present.
 3. The internal querier enabled.

When the bridge fails to build mld queries, because it has no
ipv6 address, it slilently returns, but keeps the local querier enabled.
This specific case causes confusing packet loss.

Ipv6 multicast snooping can only work if:
 a) An external querier is present
 OR
 b) The bridge has an ipv6 address an is capable of sending own queries

Otherwise it has to forward/flood the ipv6 multicast traffic,
because snooping cannot work.

This patch fixes the issue by adding a flag to the bridge struct that
indicates that there is currently no ipv6 address assinged to the bridge
and returns a false state for the local querier in
__br_multicast_querier_exists().

Special thanks to Linus Lüssing.

Fixes: d1d81d4c3d ("bridge: check return value of ipv6_dev_get_saddr()")
Signed-off-by: Daniel Danzberger <daniel@dd-wrt.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 08:03:04 -04:00
Wang Sheng-Hui
f299a02d5f net/mlx5: use mlx5_buf_alloc_node instead of mlx5_buf_alloc in mlx5_wq_ll_create
Commit 311c7c71c9 ("net/mlx5e: Allocate DMA coherent memory on
reader NUMA node") introduced mlx5_*_alloc_node() but missed changing
some calling and warn messages. This patch introduces 2 changes:
	* Use mlx5_buf_alloc_node() instead of mlx5_buf_alloc() in
	  mlx5_wq_ll_create()
	* Update the failure warn messages with _node postfix for
	  mlx5_*_alloc function names

Fixes: 311c7c71c9 ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node")
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Acked-By: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 05:17:38 -04:00
David S. Miller
d1b5a8da29 Merge branch 'bgmac-fixes'
Florian Fainelli says:

====================
net: bgmac: Random fixes

This patch series fixes a few issues spotted by code inspection and
actual testing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:25 -04:00
Florian Fainelli
3894396e64 net: bgmac: Remove superflous netif_carrier_on()
bgmac_open() calls phy_start() to initialize the PHY state machine,
which will set the interface's carrier state accordingly, no need to
force that as this could be conflicting with the PHY state determined by
PHYLIB.

Fixes: dd4544f054 ("bgmac: driver for GBit MAC core on BCMA bus")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:17 -04:00
Florian Fainelli
c3897f2a69 net: bgmac: Start transmit queue in bgmac_open
The driver does not start the transmit queue in bgmac_open(). If the
queue was stopped prior to closing then re-opening the interface, we
would never be able to wake-up again.

Fixes: dd4544f054 ("bgmac: driver for GBit MAC core on BCMA bus")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:17 -04:00
Florian Fainelli
d2b1323387 net: bgmac: Fix SOF bit checking
We are checking for the Start of Frame bit in the ctl1 word, while this
bit is set in the ctl0 word instead. Read the ctl0 word and update the
check to verify that.

Fixes: 9cde94506e ("bgmac: implement scatter/gather support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:17 -04:00
Jay Vosburgh
0622cab034 bonding: fix 802.3ad aggregator reselection
Since commit 7bb11dc9f5 ("bonding: unify all places where
actor-oper key needs to be updated."), the logic in bonding to handle
selection between multiple aggregators has not functioned.

	This affects only configurations wherein the bonding slaves
connect to two discrete aggregators (e.g., two independent switches, each
with LACP enabled), thus creating two separate aggregation groups within a
single bond.

	The cause is a change in 7bb11dc9f5 to no longer set
AD_PORT_BEGIN on a port after a link state change, which would cause the
port to be reselected for attachment to an aggregator as if were newly
added to the bond.  We cannot restore the prior behavior, as it
contradicts IEEE 802.1AX 5.4.12, which requires ports that "become
inoperable" (lose carrier, setting port_enabled=false as per 802.1AX
5.4.7) to remain selected (i.e., assigned to the aggregator).  As the port
now remains selected, the aggregator selection logic is not invoked.

	A side effect of this change is that aggregators in bonding will
now contain ports that are link down.  The aggregator selection logic
does not currently handle this situation correctly, causing incorrect
aggregator selection.

	This patch makes two changes to repair the aggregator selection
logic in bonding to function as documented and within the confines of the
standard:

	First, the aggregator selection and related logic now utilizes the
number of active ports per aggregator, not the number of selected ports
(as some selected ports may be down).  The ad_select "bandwidth" and
"count" options only consider ports that are link up.

	Second, on any carrier state change of any slave, the aggregator
selection logic is explicitly called to insure the correct aggregator is
active.

Reported-by: Veli-Matti Lintu <veli-matti.lintu@opinsys.fi>
Fixes: 7bb11dc9f5 ("bonding: unify all places where actor-oper key needs to be updated.")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:19:18 -04:00
Tom Goff
70a0dec451 ipmr/ip6mr: Initialize the last assert time of mfc entries.
This fixes wrong-interface signaling on 32-bit platforms for entries
created when jiffies > 2^31 + MFC_ASSERT_THRESH.

Signed-off-by: Tom Goff <thomas.goff@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:14:09 -04:00
Stefan Hajnoczi
4192f672fa vsock: make listener child lock ordering explicit
There are several places where the listener and pending or accept queue
child sockets are accessed at the same time.  Lockdep is unhappy that
two locks from the same class are held.

Tell lockdep that it is safe and document the lock ordering.

Originally Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> sent a similar
patch asking whether this is safe.  I have audited the code and also
covered the vsock_pending_work() function.

Suggested-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 10:44:46 -04:00
Paolo Abeni
48f1dcb55a ipv6: enforce egress device match in per table nexthop lookups
with the commit 8c14586fc3 ("net: ipv6: Use passed in table for
nexthop lookups"), net hop lookup is first performed on route creation
in the passed-in table.
However device match is not enforced in table lookup, so the found
route can be later discarded due to egress device mismatch and no
global lookup will be performed.
This cause the following to fail:

ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dummy1 up
ip link set dummy2 up
ip route add 2001:db8:8086::/48 dev dummy1 metric 20
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy1 metric 20
ip route add 2001:db8:8086::/48 dev dummy2 metric 21
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy2 metric 21
RTNETLINK answers: No route to host

This change fixes the issue enforcing device lookup in
ip6_nh_lookup_table()

v1->v2: updated commit message title

Fixes: 8c14586fc3 ("net: ipv6: Use passed in table for nexthop lookups")
Reported-and-tested-by: Beniamino Galvani <bgalvani@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 10:37:20 -04:00
David S. Miller
5aa3e24928 linux-can-fixes-for-4.7-20160623
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCgAGBQJXa6kLAAoJED07qiWsqSVqRsIH/RiHvKa9VB7yYQaXV3YqUPIo
 iizU6mQCeODqZsDw9bXce232RevKBteYDyr4YpC4f9mX54CrQI7WRN7ev5fKU49a
 FB4M9uz8v3kS5XX8gADkuDvSwtrQ7pMz1fXM2rkEyHT/xf6egCOT/lpI/mWQuNcM
 3mkMFLy5ZUAaVHAsfqu8TrDgeWMDXNxbVwGtB/AuoFJ62pqVf5M+TwzKrYaOFM4r
 Rbl3NINKwFwk41KCOz20GiVvvahCp05SPHmK0OMwxsffKZmmkUOdHvusOZx7Zxnw
 RY7Mc/j+OvvAHYnRaZmfdDEPXc2hKQP0ATjVsW/bju7PWoVpG+87mYqubIFuTSY=
 =B2gO
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.7-20160623' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2016-06-23

this is a pull request of 3 patches for the upcoming linux-4.7 release.

The first two patches are by Oliver Hartkopp fixing oopes in the generic CAN
device netlink handling. Jimmy Assarsson's patch for the kvaser_usb driver adds
support for more devices by adding their USB product ids.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 10:05:55 -04:00
Jeremy Linton
a37503bc38 net: smsc911x: Fix bug where PHY interrupts are overwritten by 0
By default, mdiobus_alloc() sets the PHYs to polling mode, but a
pointer size memcpy means that a couple IRQs end up being overwritten
with a value of 0. This means that PHY_POLL is disabled and results
in unpredictable behavior depending on the PHY's location on the
MDIO bus. Remove that memcpy and the now unused phy_irq member to
force the SMSC911x PHYs into polling mode 100% of the time.

Fixes: e7f4dc3536 ("mdio: Move allocation of interrupts into core")
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:21:15 -04:00
Or Gerlitz
c40e4096bf MAINTAINERS: Update Mellanox's mlx4 Eth NIC driver entry
Tariq Toukan is replacing Eugenia (Jenny) Emantayev as the mlx4
Ethernet driver maintainer, thanks to Jenny and good luck to him.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 15:55:07 -04:00
David S. Miller
00e8cb00bc wireless-drivers fixes for 4.7
iwlwifi
 
 * fix the scan timeout for long scans
 * fix an RCU splat caused when updating the TKIP key
 * fix a potential NULL-derefence introduced recently
 * fix a IGTK key bug that has existed since the MVM driver was introduced
 * fix some fw capabilities checks that got accidentally inverted
 
 rtl8xxxu
 
 * fix typo on variable name
 
 ath10k
 
 * fix deadlock when peer cannot be created
 * fix crash related to printing features
 * fix deadlock while processing rx_in_ord_ind
 
 ath9k
 
 * fix GPIO mask regression for AR9462 and AR9565
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJXaRGeAAoJEG4XJFUm622bc6MH/jBpajTua8/fEUwo8dEywKRD
 HULV6h5jDfoQ2N+LkKGff4UoAup4rWvzsXgTsNwPrma+TYi8M/eVrWanJ+TkwI31
 2jHh2ynBqAPNhM6oT/NKJgGPgamFsa7mvtM8wBZV4VZseIGhJcKExExLjnE64ZdG
 7o6VrtNRNtP+lnxT7ojbcS7cMnQqa7d32CqYjyJtABzLdSHNdww9euHLo9t6EFFa
 7dti3t4WftTZ0+VyZmNrLgS+RO0ix7Kbr+ZfNQPyq9DLAaSfNZR8kWpNZjR7G4BA
 QYffAkBO/iwffJS9b/VU+o8b32SV0TstTbJsEyvJcqkkTbngj2QhHZHBU012vQk=
 =uVoM
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2016-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.7

iwlwifi

* fix the scan timeout for long scans
* fix an RCU splat caused when updating the TKIP key
* fix a potential NULL-derefence introduced recently
* fix a IGTK key bug that has existed since the MVM driver was introduced
* fix some fw capabilities checks that got accidentally inverted

rtl8xxxu

* fix typo on variable name

ath10k

* fix deadlock when peer cannot be created
* fix crash related to printing features
* fix deadlock while processing rx_in_ord_ind

ath9k

* fix GPIO mask regression for AR9462 and AR9565
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 15:22:31 -04:00
Haishuang Yan
efeb2267bb geneve: fix tx_errors statistics
Tx errors present summation of errors encountered while transmitting
packets.

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 15:16:12 -04:00
Chris Packham
52fe705b49 net: vrf: replace hard tab with space in assignment
The assignment of rth->dst.output in vrf_rt6_create() and
vrf_rtable_create() used a hard tab before the '='. The neighboring
assignments did not. Make the assignment of rth->dst.output consistent
with the surrounding code.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 15:14:43 -04:00
Eric Dumazet
21de12ee55 netem: fix a use after free
If the packet was dropped by lower qdisc, then we must not
access it later.

Save qdisc_pkt_len(skb) in a temp variable.

Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 15:07:44 -04:00
WANG Cong
817e9f2c5c act_ife: acquire ife_mod_lock before reading ifeoplist
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 12:02:36 -04:00
WANG Cong
067a7cd06f act_ife: only acquire tcf_lock for existing actions
Alexey reported that we have GFP_KERNEL allocation when
holding the spinlock tcf_lock. Actually we don't have
to take that spinlock for all the cases, especially
for the new one we just create. To modify the existing
actions, we still need this spinlock to make sure
the whole update is atomic.

For net-next, we can get rid of this spinlock because
we already hold the RTNL lock on slow path, and on fast
path we can use RCU to protect the metalist.

Joint work with Jamal.

Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 12:02:36 -04:00
Herbert Xu
962fcef33b esp: Fix ESN generation under UDP encapsulation
Blair Steven noticed that ESN in conjunction with UDP encapsulation
is broken because we set the temporary ESP header to the wrong spot.

This patch fixes this by first of all using the right spot, i.e.,
4 bytes off the real ESP header, and then saving this information
so that after encryption we can restore it properly.

Fixes: 7021b2e1cd ("esp4: Switch to new AEAD interface")
Reported-by: Blair Steven <Blair.Steven@alliedtelesis.co.nz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 11:52:00 -04:00
Jimmy Assarsson
71873a9b38 can: kvaser_usb: Add support for more Kvaser Leaf v2 devices
This patch adds support for Kvaser Leaf Light HS v2 OEM, Mini PCI
Express 2xHS and USBcan Light 2xHS.

Signed-off-by: Jimmy Assarsson <jimmyassarsson@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-06-23 11:16:41 +02:00