Commit graph

58785 commits

Author SHA1 Message Date
Narendra K
dffebd2c5c doc:networking: Update comment for dev_id field in netdevice.h
This patch updates the comment for 'dev_id' field in
'include/linux/netdevice.h' to reflect the intended
usage of 'dev_id'.

References: http://marc.info/?l=linux-netdev&m=136992115300526&w=2
References: http://marc.info/?l=linux-netdev&m=137062569014612&w=2

Signed-off-by: Narendra K <narendra_k@dell.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 03:03:47 -07:00
Daniel Borkmann
da5bab079f net: udp4: move GSO functions to udp_offload
Similarly to TCP offloading and UDPv6 offloading, move all related
UDPv4 functions to udp_offload.c to make things more explicit. Also,
by this, we can make those functions static.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 00:47:25 -07:00
Eric Dumazet
e989707135 igmp: hash a hash table to speedup ip_check_mc_rcu()
After IP route cache removal, multicast applications using
a lot of multicast addresses hit a O(N) behavior in ip_check_mc_rcu()

Add a per in_device hash table to get faster lookup.

This hash table is created only if the number of items in mc_list is
above 4.

Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 00:25:23 -07:00
Eric Dumazet
130d3d68b5 net_sched: psched_ratecfg_precompute() improvements
Before allowing 64bits bytes rates, refactor
psched_ratecfg_precompute() to get better comments
and increased accuracy.

rate_bps field is renamed to rate_bytes_ps, as we only
have to worry about bytes per second.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 22:39:47 -07:00
Eric Dumazet
45203a3b38 net_sched: add 64bit rate estimators
struct gnet_stats_rate_est contains u32 fields, so the bytes per second
field can wrap at 34360Mbit.

Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields,
and switch the kernel to use this structure natively.

This structure is dumped to user space as a new attribute :

TCA_STATS_RATE_EST64

Old tc command will now display the capped bps (to 34360Mbit), instead
of wrapped values, and updated tc command will display correct
information.

Old tc command output, after patch :

eric:~# tc -s -d qd sh dev lo
qdisc pfifo 8001: root refcnt 2 limit 1000p
 Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0)
 rate 34360Mbit 189696pps backlog 0b 0p requeues 0

This patch carefully reorganizes "struct Qdisc" layout to get optimal
performance on SMP.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:51:03 -07:00
Gao feng
da12c90e09 netlink: Add compare function for netlink_table
As we know, netlink sockets are private resource of
net namespace, they can communicate with each other
only when they in the same net namespace. this works
well until we try to add namespace support for other
subsystems which use netlink.

Don't like ipv4 and route table.., it is not suited to
make these subsytems belong to net namespace, Such as
audit and crypto subsystems,they are more suitable to
user namespace.

So we must have the ability to make the netlink sockets
in same user namespace can communicate with each other.

This patch adds a new function pointer "compare" for
netlink_table, we can decide if the netlink sockets can
communicate with each other through this netlink_table
self-defined compare function.

The behavior isn't changed if we don't provide the compare
function for netlink_table.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:39:42 -07:00
Vlad Yasevich
867a59436f bridge: Add a flag to control unicast packet flood.
Add a flag to control flood of unicast traffic.  By default, flood is
on and the bridge will flood unicast traffic if it doesn't know
the destination.  When the flag is turned off, unicast traffic
without an FDB will not be forwarded to the specified port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:04:32 -07:00
Vlad Yasevich
9ba18891f7 bridge: Add flag to control mac learning.
Allow user to control whether mac learning is enabled on the port.
By default, mac learning is enabled.  Disabling mac learning will
cause new dynamic FDB entries to not be created for a particular port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:04:32 -07:00
Cong Wang
30f3a40f9a net: remove last caller of skb_tail_offset() and itself
Similar to the following commits:

commit 00f97da17a (netpoll: fix position of network header)
commit 525cebedb3 (pktgen: Fix position of ip and udp header)

using skb_tail_offset() seems not correct since the offset
is based on head pointer.

With the last caller removed, skb_tail_offset() can be killed
finally.

Cc: Thomas Graf <tgraf@suug.ch>
Cc: Daniel Borkmann <dborkmann@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 22:22:23 -07:00
Eliezer Tamir
0602129286 net: add low latency socket poll
Adds an ndo_ll_poll method and the code that supports it.
This method can be used by low latency applications to busy-poll
Ethernet device queues directly from the socket code.
sysctl_net_ll_poll controls how many microseconds to poll.
Default is zero (disabled).
Individual protocol support will be added by subsequent patches.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 21:22:35 -07:00
Eliezer Tamir
af12fa6e46 net: add napi_id and hash
Adds a napi_id and a hashing mechanism to lookup a napi by id.
This will be used by subsequent patches to implement low latency
Ethernet device polling.
Based on a code sample by Eric Dumazet.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 21:22:35 -07:00
Jason Wang
815f236d62 macvtap: add TUNSETQUEUE ioctl
This patch adds TUNSETQUEUE ioctl to let userspace can temporarily disable or
enable a queue of macvtap. This is used to be compatible at API layer of tuntap
to simplify the userspace to manage the queues. This is done through introducing
a linked list to track all taps while using vlan->taps array to only track
active taps.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 23:49:09 -07:00
Jason Wang
f0afce01aa macvlan: change the max number of queues to 16
Macvtap should be at least compatible with tap, so change the max number to 16.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 23:49:09 -07:00
Jason Wang
1d2f41ed23 macvlan: switch to use IS_ENABLED()
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 23:49:08 -07:00
Daniel Borkmann
28850dc7c7 net: tcp: move GRO/GSO functions to tcp_offload
Would be good to make things explicit and move those functions to
a new file called tcp_offload.c, thus make this similar to tcpv6_offload.c.
While moving all related functions into tcp_offload.c, we can also
make some of them static, since they are only used there. Also, add
an explicit registration function.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 14:39:05 -07:00
Daniel Borkmann
350e780570 net: vlan: minor: remove unused HAVE_VLAN_PUT_TAG
Remove the definition of HAVE_VLAN_PUT_TAG since it's not
used or exported anywhere.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 14:39:05 -07:00
David S. Miller
93a306aef5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Merge 'net' into 'net-next' to get the MSG_CMSG_COMPAT
regression fix.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-06 23:39:26 -07:00
Linus Torvalds
1612e111e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fix from David Miller:
 "This is a quick one commit pull request to cure the regression
  introduced by the MSG_CMSG_COMPAT change."

(Background: commit 1be374a051 completely broke 32-bit COMPAT handling
by not only disallowing MSG_CMSG_COMPAT from user APIs, but clearing it
in our own internal use too!)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: Unbreak compat_sys_{send,recv}msg
2013-06-06 18:09:05 -07:00
Andy Lutomirski
a7526eb5d0 net: Unbreak compat_sys_{send,recv}msg
I broke them in this commit:

    commit 1be374a051
    Author: Andy Lutomirski <luto@amacapital.net>
    Date:   Wed May 22 14:07:44 2013 -0700

        net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

This patch adds __sys_sendmsg and __sys_sendmsg as common helpers that accept
MSG_CMSG_COMPAT and blocks MSG_CMSG_COMPAT at the syscall entrypoints.  It
also reverts some unnecessary checks in sys_socketcall.

Apparently I was suffering from underscore blindness the first time around.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-06 11:52:14 -07:00
David S. Miller
143554ace8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Conflicts:
	net/netfilter/nf_log.c

The conflict in nf_log.c is that in 'net' we added CONFIG_PROC_FS
protection around foo_proc_entry() calls to fix a build failure,
whereas in Pablo's tree a guard if() test around a call is
remove_proc_entry() was removed.  Trivially resolved.

Pablo Neira Ayuso says:

====================
The following patchset contains the first batch of
Netfilter/IPVS updates for your net-next tree, they are:

* Three patches with improvements and code refactorization
  for nfnetlink_queue, from Florian Westphal.

* FTP helper now parses replies without brackets, as RFC1123
  recommends, from Jeff Mahoney.

* Rise a warning to tell everyone about ULOG deprecation,
  NFLOG has been already in the kernel tree for long time
  and supersedes the old logging over netlink stub, from
  myself.

* Don't panic if we fail to load netfilter core framework,
  just bail out instead, from myself.

* Add cond_resched_rcu, used by IPVS to allow rescheduling
  while walking over big hashtables, from Simon Horman.

* Change type of IPVS sysctl_sync_qlen_max sysctl to avoid
  possible overflow, from Zhang Yanfei.

* Use strlcpy instead of strncpy to skip zeroing of already
  initialized area to write the extension names in ebtables,
  from Chen Gang.

* Use already existing per-cpu notrack object from xt_CT,
  from Eric Dumazet.

* Save explicit socket lookup in xt_socket now that we have
  early demux, also from Eric Dumazet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-06 01:03:06 -07:00
Peter Zijlstra
29eb77825c arch, mm: Remove tlb_fast_mode()
Since the introduction of preemptible mmu_gather TLB fast mode has been
broken. TLB fast mode relies on there being absolutely no concurrency;
it frees pages first and invalidates TLBs later.

However now we can get concurrency and stuff goes *bang*.

This patch removes all tlb_fast_mode() code; it was found the better
option vs trying to patch the hole by entangling tlb invalidation with
the scheduler.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-06 10:07:26 +09:00
David S. Miller
6bc19fb82d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Merge 'net' bug fixes into 'net-next' as we have patches
that will build on top of them.

This merge commit includes a change from Emil Goode
(emilgoode@gmail.com) that fixes a warning that would
have been introduced by this merge.  Specifically it
fixes the pingv6_ops method ipv6_chk_addr() to add a
"const" to the "struct net_device *dev" argument and
likewise update the dummy_ipv6_chk_addr() declaration.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-05 16:37:30 -07:00
Linus Torvalds
4d3797d7e1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix timeouts with direct mode authentication in mac80211, from
    Stanislaw Gruszka.

 2) Aggregation sessions can deadlock in ath9k, from Felix Fietkau.

 3) Netfilter's xt_addrtype doesn't work with ipv6 due to route lookups
    creating undesirable cache entries, from Florian Westphal.

 4) Fix netfilter's ipt_ULOG from generating non-NULL terminated
    strings.

 5) Fix netdev transmit queue crashes in mac80211, from Johannes Berg.

 6) Fix copy and paste error in 802.11 stack that broke reporting of
    64-bit station tx statistics, from Felix Fietkau.

 7) When qlge_probe fails, it leaks the netdev.  Fix from Wei Yongjun.

 8) SKB control block (where we store the IP options information,
    amongst other things) must be cleared properly otherwise ICMP
    sending can crash for IP tunnels.  Fix from Eric Dumazet.

 9) Verification of Energy Efficient Ether support was coded wrongly,
    the test was inversed.  Fix from Giuseppe CAVALLARO.

10) TCP handles redirects improperly because the wrong flow key is used
    for the route lookup.  From Michal Kubecek.

11) Don't interpret MSG_CMSG_COMPAT from userspace, fix from Andy
    Lutomirski.

12) The new AF_VSOCK was missing from the lockdep string table, fix from
    Federico Vaga.

13) be2net doesn't handle checksumming of IP fragments properly, from
    Somnath Kotur.

14) Fix several bugs in the device address list code that lead to
    crashes and other misbehaviors.  From Jay Vosburgh.

15) Fix ipv6 segmentation handling of fragmented GRE tunnel traffic,
    from Pravin B Shalr.

16) Fix usage of stale policies in IPSEC layer, from Paul Moore.

17) Fix team driver dump of ports when there are a large number of them,
    from Jiri Pirko.

18) Fix softlockups in UDP ipv4 socket lookup causes by and error in the
    hlist_nulls_for_each_entry_rcu() macro.  From Eric Dumazet.

19) Fix several regressions added by the high rate accuracy changes to
    the htb packet scheduler.  From Eric Dumazet.

20) Fix DMA'ing onto the stack in esd_usb2 and peak_usb CAN drivers,
    from Olivier Sobrie and Marc Kleine-Budde.

21) Fix unremovable network devices due to missing route pointer
    installation in the per-device ipv6 address list entries.  From Gao
    feng.

22) Apply the tg3 5719 DMA workaround on 5720 chips as well, otherwise
    we get stalls.  From Nithin Sujir.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (68 commits)
  net_sched: htb: do not mix 1ns and 64ns time units
  net: fix sk_buff head without data area
  tg3: Add read dma workaround for 5720
  net: ethernet: xilinx_emaclite: set protocol selector bits when writing ANAR
  bnx2x: Fix bridged GSO for 57710/57711 chips
  net: fec: add fallback to random MAC address
  bnx2x: fix TCP offload for tunneling ipv4 over ipv6
  ipv6: assign rt6_info to inet6_ifaddr in init_loopback
  net/mlx4_core: Keep VF assigned MAC in the PF admin table
  net/mlx4_en: Handle unassigned VF MAC address correctly
  net/mlx4_core: Return -EPROBE_DEFER when a VF is probed before PF is sufficiently initialized
  net/mlx4_en: Fix adaptive moderation cq update
  net: can: peak_usb: Do not do dma on the stack
  net: can: esd_usb2: Do not do dma on the stack
  net: can: kvaser_usb: fix reception on "USBcan Pro" and "USBcan R" type hardware.
  net_sched: restore "overhead xxx" handling
  net: force a reload of first item in hlist_nulls_for_each_entry_rcu
  hyperv: Fix vlan_proto setting in netvsc_recv_callback()
  team: fix port list dump for big number of ports
  list: introduce list_first_entry_or_null
  ...
2013-06-05 19:19:04 +09:00
Joe Perches
f145b67f24 transp_v6.h: style neatening
Use a more current code style.

Remove extern from function prototypes.
Align function arguments and reflow to 80 cols.
Use network comment styles.

Signed-off-by: Joe Perches <joe@perches.com>
cc: Lorenzo Colitti <lorenzo@google.com>,
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 16:43:42 -07:00
Jean Sacren
0e9649c143 net: do not manually initialize enumerators
Clean up unnecessary initialization of enumerators as the compiler takes
care of that.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 15:17:38 -07:00
Lorenzo Colitti
d862e54614 net: ipv6: Implement /proc/net/icmp6.
The format is based on /proc/net/icmp and /proc/net/{udp,raw}6.

Compiles and displays reasonable results with CONFIG_IPV6={n,m,y}
Couldn't figure out how to test without CONFIG_PROC_FS enabled.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:56:14 -07:00
Lorenzo Colitti
8cc785f6f4 net: ipv4: make the ping /proc code AF-independent
Introduce a ping_seq_afinfo structure (similar to its UDP
equivalent) and use it to make some of the ping /proc functions
address-family independent. Rename the remaining ping /proc
functions from ping_* to ping_v4_*.

Compiles and displays reasonable results with CONFIG_IPV6={n,m,y}

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:56:14 -07:00
Lorenzo Colitti
17ef66afc0 net: ipv6: Unify {raw,udp}6_sock_seq_show.
udp6_sock_seq_show and raw6_sock_seq_show are identical, except
the UDP version displays ports and the raw version displays the
protocol. Refactor most of the code in these two functions into
a new common ip6_dgram_sock_seq_show function, in preparation
for using it to display ICMPv6 sockets as well.

Also reduce the indentation in parts of include/net/transp_v6.h
to improve readability.

Compiles and displays reasonable results with CONFIG_IPV6={n,m,y}

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:56:14 -07:00
Lorenzo Colitti
75698b17ac Clean up indentation in net/ipv6/transp_v6.h
Reduce the indentation of most of the functions and make it a
bit more consistent. This allows longer function and arg names
to be consistently indented without wrapping.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:56:14 -07:00
Linus Torvalds
286e050bc0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Recent bug fixes, one of them touches a common code file.

  It adds two #ifndef/#endif pairs to asm-generic/io.h to be able to
  override xlate_dev_kmem_ptr and xlate_dev_mem_ptr."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pgtable: Fix gmap notifier address
  s390/dasd: fix handling of gone paths
  s390/pgtable: Fix check for pgste/storage key handling
  arch: s390: appldata: using strncpy() and strnlen() instead of sprintf()
  s390/smp: lost IPIs on cpu hotplug
  kernel: Fix s390 absolute memory access for /dev/mem
  s390/dma: do not call debug_dma after free
2013-06-03 18:04:07 +09:00
Linus Torvalds
7d80fea426 Merge branch 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:

 - Fix for yet another xattr bug which may lead to NULL deref.

 - A subtle bug in for_each_descendant_pre().  This bug requires quite
   specific conditions to trigger and isn't too likely to actually
   happen in the wild, but maybe that just makes it that much more
   nastier.

 - A warning message added for silly cgroup re-mount (not -o remount,
   but unmount followed by mount) behavior.

* 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: warn about mismatching options of a new mount of an existing hierarchy
  cgroup: fix a subtle bug in descendant pre-order walk
  cgroup: initialize xattr before calling d_instantiate()
2013-06-03 17:57:16 +09:00
Timo Teräs
5aad1de5ea ipv4: use separate genid for next hop exceptions
commit 13d82bf5 (ipv4: Fix flushing of cached routing informations)
added the support to flush learned pmtu information.

However, using rt_genid is quite heavy as it is bumped on route
add/change and multicast events amongst other places. These can
happen quite often, especially if using dynamic routing protocols.

While this is ok with routes (as they are just recreated locally),
the pmtu information is learned from remote systems and the icmp
notification can come with long delays. It is worthy to have separate
genid to avoid excessive pmtu resets.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-03 00:07:43 -07:00
Eric Dumazet
01cb71d2d4 net_sched: restore "overhead xxx" handling
commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "overhead xxx" handling, as well as the "linklayer atm"
attribute.

tc class add ... htb rate X ceil Y linklayer atm overhead 10

This patch restores the "overhead xxx" handling, for htb, tbf
and act_police

The "linklayer atm" thing needs a separate fix.

Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vimalkumar <j.vimal@gmail.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-02 22:22:35 -07:00
Eric Dumazet
c87a124a5d net: force a reload of first item in hlist_nulls_for_each_entry_rcu
Roman Gushchin discovered that udp4_lib_lookup2() was not reloading
first item in the rcu protected list, in case the loop was restarted.

This produced soft lockups as in https://lkml.org/lkml/2013/4/16/37

rcu_dereference(X)/ACCESS_ONCE(X) seem to not work as intended if X is
ptr->field :

In some cases, gcc caches the value or ptr->field in a register.

Use a barrier() to disallow such caching, as documented in
Documentation/atomic_ops.txt line 114

Thanks a lot to Roman for providing analysis and numerous patches.

Diagnosed-by: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-02 20:53:59 -07:00
Linus Torvalds
008bd2de94 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
 "The highlights include:

   - Re-instate sess->wait_list in target_wait_for_sess_cmds() for
     active I/O shutdown handling in fabrics using se_cmd->cmd_kref
   - Make ib_srpt call target_sess_cmd_list_set_waiting() during session
     shutdown
   - Fix FILEIO off-by-one READ_CAPACITY bug for !S_ISBLK export
   - Fix iscsi-target login error heap buffer overflow (Kees)
   - Fix iscsi-target active I/O shutdown handling regression in
     v3.10-rc1

  A big thanks to Kees Cook for fixing a long standing login error
  buffer overflow bug.

  All patches are CC'ed to stable with the exception of the v3.10-rc1
  specific regression + other minor target cleanup."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Fix iscsit_free_cmd() se_cmd->cmd_kref shutdown handling
  target: Propigate up ->cmd_kref put return via transport_generic_free_cmd
  iscsi-target: fix heap buffer overflow on error
  target/file: Fix off-by-one READ_CAPACITY bug for !S_ISBLK export
  ib_srpt: Call target_sess_cmd_list_set_waiting during shutdown_session
  target: Re-instate sess_wait_list for target_wait_for_sess_cmds
  target: Remove unused wait_for_tasks bit in target_wait_for_sess_cmds
2013-06-01 20:05:20 +09:00
Linus Torvalds
c361cb59ac fbdev: fixes for 3.10-rc4
This contains some small fixes
 
 Atmel LCDC: fix blank the backlight on remove
 ps3fb: fix compile warning
 OMAPDSS: Fix crash with DT boot
 
 Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIbBAABAgAGBQJRqPk4AAoJEOrjwV5ZMRf2N10P+NXJl0gllH/aVmLv46/nkOKi
 lO5PPxmswJzxYJZe/fl6nodbDIUlPisjl5y5KSBNzsXoKMyCxGvWdUvNt3DgcDqs
 n+VunNAcbh9/8kcUe9cIRbFYIX15XtxKCJbXUoXqwLnnfFwXbXfy/AnnlkdIOeNO
 0qyK+M0bJ3Y5s4dWYrfhU+yPxek2908agEamhAROn0rgzZD54h3mUrNPp15Hmc0a
 JFrUrKzrjX8TY804bw873Y7RmLpERXU8yXaZdW01MxDE2r2RfJrp90eEzCfjdjJD
 OpC4liPO87+vPwtremSzj2b/LojWdbZ+a08s4SZ+798J+Gbg/YCRIm3JrFXJsHvJ
 kMGMVEciew+gVnGYdrs2CAeiN1xPZpF1NRv8tMLFhTieFHKw7NupIHG5o4MX7/9n
 Bv64+SYt6ox3U1dZlRdQOzjlt3yo1hT8qyQBvKgz0icEnW/ZV0d4oYFfDG7WyHgi
 4pxOw8r96KtxpS/ZN//cu4oygGRf1EL3hHiaguJI4V4fSiEwaEL6uUgK9RFINLnJ
 5RhyoQeRATKXlgadzbOHFssJsECSTtszgsWmY/xe2F9uDdDSucTo9bRo+4re7QG0
 6l986FHJou22nolAGisjpLei11ESlba4VvCv8H6ej3MaKGRjJEYQ4s/uguJ2pMti
 wuhoEB0/UjOs5ZzCI+0=
 =tjLq
 -----END PGP SIGNATURE-----

Merge tag 'fbdev-for-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev

Pull fbdev fixes from Jean-Christophe PLAGNIOL-VILLARD:
 "This contains some small fixes

   - Atmel LCDC: fix blank the backlight on remove
   - ps3fb: fix compile warning
   - OMAPDSS: Fix crash with DT boot"

* tag 'fbdev-for-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev:
  atmel_lcdfb: blank the backlight on remove
  trivial: atmel_lcdfb: add missing error message
  OMAPDSS: Fix crash with DT boot
  fbdev/ps3fb: fix compile warning
2013-06-01 19:53:41 +09:00
Jiri Pirko
6d7581e62f list: introduce list_first_entry_or_null
non-rcu variant of list_first_or_null_rcu

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:31:52 -07:00
Paul Moore
e4c1721642 xfrm: force a garbage collection after deleting a policy
In some cases after deleting a policy from the SPD the policy would
remain in the dst/flow/route cache for an extended period of time
which caused problems for SELinux as its dynamic network access
controls key off of the number of XFRM policy and state entries.
This patch corrects this problem by forcing a XFRM garbage collection
whenever a policy is sucessfully removed.

Reported-by: Ondrej Moris <omoris@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:30:07 -07:00
Florian Fainelli
2cc70ba4cf phy: add reverse MII PHY connection type
The PHY library currently does not know about the the reverse MII
connection type. Add it to the list of supported PHY modes and update
of_get_phy_mode() to support it and look for the string "rev-mii".

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:19:05 -07:00
Nicolas Dichtel
bf3d6a8f79 iptunnel: specify protocol outside IP header
Before this patch, ip_tunnel_xmit() was using the field protocol from the IP
header passed into argument.
There is no functional change, this patch prepares the support of IPv4 over
IPv4 for module sit.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:19:05 -07:00
Pravin B Shelar
1e2bd517c1 udp6: Fix udp fragmentation for tunnel traffic.
udp6 over GRE tunnel does not work after to GRE tso changes. GRE
tso handler passes inner packet but keeps track of outer header
start in SKB_GSO_CB(skb)->mac_offset.  udp6 fragment need to
take care of outer header, which start at the mac_offset, while
adding fragment header.
This bug is introduced by commit 68c3316311 (GRE: Add TCP
segmentation offload for GRE).

Reported-by: Dmitry Kravkov <dkravkov@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:06:07 -07:00
Cong Wang
35d0461061 net: clean up skb headers code
commit 1a37e412a0 (net: Use 16bits for *_headers
fields of struct skbuff) converts skb->*_header to u16,
some #if NET_SKBUFF_DATA_USES_OFFSET are now useless,
and to be safe, we could just use "X = (typeof(X)) ~0U;"
as suggested by David.

Cc: David S. Miller <davem@davemloft.net>
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:02:47 -07:00
Linus Torvalds
977b55cf98 Can't call pci_get_domain_bus_and_slot() from interupt context
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJRp5IvAAoJEKurIx+X31iBGqMP/2aXdk0hBXKM76uCyYt1cn/X
 MibhCgX17L65BOvL19CB0Ylj+uAlgAw/0qzwMqP8eKgt1Unx0vzY5Pf0B1nOsHeF
 YQU2RAhu5Xw23sTSf3b9tRADki4n14pzhwfTzOSqY/a9JxeatoFElCyMjcXL5UaY
 hL96Ke1pvWZrAI3VT9tneH4AV0qfMajd7xSdCyS22+IdzcxEw/xln/SCXZieeFxE
 WQpNGGB/pSwJYnTU8nNJQdFLggO19Sa0v+rOZJzQ08BuKL1nJ6rlEZejk6T668So
 RJNlGTBTHdwMiH7HctUb9iYcTfyhGlZ+kxdq7puRtgKABbjMnzlSWpbJzIv/LSsm
 M0atocT9Se82V9g6g4lizEeBuwUjfv94GXrWlAtVgpi8KKpF0LOwMeeFvciFp96k
 p6hILjqiYVF+G9lP0Vkh1DPp9a2MfROLX0+GKZtuLjloZ0WJbqdOY8wHvFApsltL
 B5szN4TLLutQfHkfqH27MIDtFQmJe+tieMJ0EyfAJan4eigrlZu0iqw4tJkZWiHj
 yWIIgwaxFvZA639yWObVfk52nesjNuWNtVfKM+OjcUsR0O7UO9ZhcoI9Oew7lxqD
 XoEv4FgtPBFDz7Q659auUMrHx4cZ/STkxikUrLwQmR0QeR9nVW9OnGBGTsR8EKm+
 tymC0x7AYmnDJUPoZHqM
 =ToBG
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-aertracefix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull aer error logging fix from Tony Luck:
 "Can't call pci_get_domain_bus_and_slot() from interupt context"

* tag 'please-pull-aertracefix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  aerdrv: Move cper_print_aer() call out of interrupt context
2013-06-01 06:47:04 +09:00
Nicholas Bellinger
d5ddad4168 target: Propigate up ->cmd_kref put return via transport_generic_free_cmd
Go ahead and propigate up the ->cmd_kref put return value from
target_put_sess_cmd() -> transport_release_cmd() -> transport_put_cmd()
-> transport_generic_free_cmd().

This is useful for certain fabrics when determining the active I/O
shutdown case with SCF_ACK_KREF where a final target_put_sess_cmd()
is still required by the caller.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-05-31 01:21:23 -07:00
Michal Simek
10e24caa98 phy: Add Marvell 88E1510 phy ID
Add support for this new phy ID.

Signed-off-by: Rick Hoover <RHoover@digilentinc.com>
Signed-off-by: Steven Wang <steven.wang@digilentinc.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 00:48:22 -07:00
Michal Simek
3da09a5154 phy: Add Marvell 88E1116R phy ID
This phy is on Xilinx ZC702 zynq development board.

Signed-off-by: Anirudha Sarangi <anirudh@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 00:48:22 -07:00
Sebastian Hesselbarth
5f292354e7 net: mv643xx_eth: add phy_node to platform_data struct
This adds a struct device_node pointer for a phy passed by phandle
to mv643xx_eth node.

Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-30 17:54:03 -07:00
David S. Miller
73ce00d4d6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter/IPVS fixes for 3.10-rc3,
they are:

* fix xt_addrtype with IPv6, from Florian Westphal. This required
  a new hook for IPv6 functions in the netfilter core to avoid
  hard dependencies with the ipv6 subsystem when this match is
  only used for IPv4.

* fix connection reuse case in IPVS. Currently, if an reused
  connection are directed to the same server. If that server is
  down, those connection would fail. Therefore, clear the
  connection and choose a new server among the available ones.

* fix possible non-nul terminated string sent to user-space if
  ipt_ULOG is used as the default netfilter logging stub, from
  Chen Gang.

* fix mark logging of IPv6 packets in xt_LOG, from Michal Kubecek.
  This bug has been there since 2.6.26.

* Fix breakage ip_vs_sh due to incorrect structure layout for
  RCU, from Jan Beulich.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-30 16:38:38 -07:00
Linus Torvalds
3655b22de0 Fixes:
- Use proper error paths
  - Clean up APIC IPI usage (incorrect arguments)
  - Delay XenBus frontend resume is backend (xenstored) is not running
  - Fix build error with various combinations of CONFIG_
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJRp59pAAoJEFjIrFwIi8fJRIYIAKvRh2Dp/AB44ZN97MW/QhEN
 NUvrSTYr2HlqcUW7bv0ScrMLb0LlFeo+9s/bo0KI2+2F+zK822WPC+2KEZmzQIVs
 q261dNsA3/HoyBDOLwWjatjsSus+njBOEgDIwARPwhkoon4fRXBnRJVMy+0bZC3I
 fpd1nlUy0J7jW0QLO5ueKqd5ZN0Mkwn2H4+D8TOPVYHCnk3mT2W+qLCEJmkMxOuZ
 iFYy95K1ky5r0leUUwCTUIGLmgftoh0Qo/RweXSmzuLiZrY+5ilike3gxQSiAjsM
 lIjq+gKXNJJGz4M6wbOTfDzb/WQnKD+2PqlsbulrTD7E6RD6wIsqG/zvc1RqHqw=
 =9gi8
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull Xen fixes from Konrad Rzeszutek Wilk:
 - Use proper error paths
 - Clean up APIC IPI usage (incorrect arguments)
 - Delay XenBus frontend resume is backend (xenstored) is not running
 - Fix build error with various combinations of CONFIG_

* tag 'stable/for-linus-3.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus_client.c: correct exit path for xenbus_map_ring_valloc_hvm
  xen-pciback: more uses of cached MSI-X capability offset
  xen: Clean up apic ipi interface
  xenbus: save xenstore local status for later use
  xenbus: delay xenbus frontend resume if xenstored is not running
  xmem/tmem: fix 'undefined variable' build error.
2013-05-31 06:01:18 +09:00
Lance Ortiz
37448adfc7 aerdrv: Move cper_print_aer() call out of interrupt context
The following warning was seen on 3.9 when a corrected PCIe error was being
handled by the AER subsystem.

WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90()

This occurred because a call to pci_get_domain_bus_and_slot() was added to
cper_print_pcie() to setup for the call to cper_print_aer().  The warning
showed up because cper_print_pcie() is called in an interrupt context and
pci_get* functions are not supposed to be called in that context.

The solution is to move the cper_print_aer() call out of the interrupt
context and into aer_recover_work_func() to avoid any warnings when calling
pci_get* functions.

Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-05-30 10:51:20 -07:00