Commit graph

44959 commits

Author SHA1 Message Date
Pablo Neira
f719148346 netfilter: add hook list to nf_hook_state
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-14 01:10:05 -04:00
Pablo Neira
87d5c18ce1 netfilter: cleanup struct nf_hook_ops indentation
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-14 01:10:05 -04:00
Pablo Neira
f0b5e8a42f net: kill useless net_*_ingress_queue() definitions when NET_CLS_ACT is unset
This fixes 4577139b2d ("net: use jump label patching for ingress qdisc in
__netif_receive_skb_core").

The only client of this is sch_ingress and it depends on NET_CLS_ACT. So
there is no way these definition can be of any help.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:44:28 -04:00
Jiri Pirko
06635a35d1 flow_dissect: use programable dissector in skb_flow_dissect and friends
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:47 -04:00
Jiri Pirko
5605c76240 net: move __skb_tx_hash to dev.c
__skb_tx_hash function has no relation to flow_dissect so just move it
to dev.c

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:46 -04:00
Jiri Pirko
9c684b5083 net: move __skb_get_hash function declaration to flow_dissector.h
Since the definition of the function is in flow_dissector.c, it makes
sense to have the declaration in flow_dissector.h

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:46 -04:00
Jiri Pirko
10b89ee43e net: move *skb_get_poff declarations into correct header
Since these functions are defined in flow_dissector.c, move header
declarations from skbuff.h into flow_dissector.h

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:45 -04:00
Jiri Pirko
1bd758eb1c net: change name of flow_dissector header to match the .c file name
add couple of empty lines on the way.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:45 -04:00
David S. Miller
b04096ff33 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Four minor merge conflicts:

1) qca_spi.c renamed the local variable used for the SPI device
   from spi_device to spi, meanwhile the spi_set_drvdata() call
   got moved further up in the probe function.

2) Two changes were both adding new members to codel params
   structure, and thus we had overlapping changes to the
   initializer function.

3) 'net' was making a fix to sk_release_kernel() which is
   completely removed in 'net-next'.

4) In net_namespace.c, the rtnl_net_fill() call for GET operations
   had the command value fixed, meanwhile 'net-next' adjusted the
   argument signature a bit.

This also matches example merge resolutions posted by Stephen
Rothwell over the past two days.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 14:31:43 -04:00
110bc76729 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Handle max TX power properly wrt VIFs and the MAC in iwlwifi, from
    Avri Altman.

 2) Use the correct FW API for scan completions in iwlwifi, from Avraham
    Stern.

 3) FW monitor in iwlwifi accidently uses unmapped memory, fix from Liad
    Kaufman.

 4) rhashtable conversion of mac80211 station table was buggy, the
    virtual interface was not taken into account.  Fix from Johannes
    Berg.

 5) Fix deadlock in rtlwifi by not using a zero timeout for
    usb_control_msg(), from Larry Finger.

 6) Update reordering state before calculating loss detection, from
    Yuchung Cheng.

 7) Fix off by one in bluetooth firmward parsing, from Dan Carpenter.

 8) Fix extended frame handling in xiling_can driver, from Jeppe
    Ledet-Pedersen.

 9) Fix CODEL packet scheduler behavior in the presence of TSO packets,
    from Eric Dumazet.

10) Fix NAPI budget testing in fm10k driver, from Alexander Duyck.

11) macvlan needs to propagate promisc settings down the the lower
    device, from Vlad Yasevich.

12) igb driver can oops when changing number of rings, from Toshiaki
    Makita.

13) Source specific default routes not handled properly in ipv6, from
    Markus Stenberg.

14) Use after free in tc_ctl_tfilter(), from WANG Cong.

15) Use softirq spinlocking in netxen driver, from Tony Camuso.

16) Two ARM bpf JIT fixes from Nicolas Schichan.

17) Handle MSG_DONTWAIT properly in ring based AF_PACKET sends, from
    Mathias Kretschmer.

18) Fix x86 bpf JIT implementation of FROM_{BE16,LE16,LE32}, from Alexei
    Starovoitov.

19) ll_temac driver DMA maps TX packet header with incorrect length, fix
    from Michal Simek.

20) We removed pm_qos bits from netdevice.h, but some indirect
    references remained.  Kill them.  From David Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
  net: Remove remaining remnants of pm_qos from netdevice.h
  e1000e: Add pm_qos header
  net: phy: micrel: Fix regression in kszphy_probe
  net: ll_temac: Fix DMA map size bug
  x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions
  netns: return RTM_NEWNSID instead of RTM_GETNSID on a get
  Update be2net maintainers' email addresses
  net_sched: gred: use correct backlog value in WRED mode
  pppoe: drop pppoe device in pppoe_unbind_sock_work
  net: qca_spi: Fix possible race during probe
  net: mdio-gpio: Allow for unspecified bus id
  af_packet / TX_RING not fully non-blocking (w/ MSG_DONTWAIT).
  bnx2x: limit fw delay in kdump to 5s after boot
  ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12bits.
  ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K intruction.
  mpls: Change reserved label names to be consistent with netbsd
  usbnet: avoid integer overflow in start_xmit
  netxen_nic: use spin_[un]lock_bh around tx_clean_lock (2)
  net: xgene_enet: Set hardware dependency
  net: amd-xgbe: Add hardware dependency
  ...
2015-05-12 21:10:38 -07:00
David Ahern
01d460dd70 net: Remove remaining remnants of pm_qos from netdevice.h
Commit e2c6544829 removed pm_qos from struct net_device but left the
comment and header file. Remove those.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:22:03 -04:00
Michael Holzheu
cffc642d93 test_bpf: add 173 new testcases for eBPF
add an exhaustive set of eBPF tests bringing total to:
test_bpf: Summary: 233 PASSED, 0 FAILED, [0/226 JIT'ed]

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:15:25 -04:00
Denys Vlasenko
a2029240e5 net: deinline netif_tx_stop_all_queues(), remove WARN_ON in netif_tx_stop_queue()
These functions compile to 60 bytes of machine code each.
With this .config: http://busybox.net/~vda/kernel_config
there are 617 calls of netif_tx_stop_queue()
and 49 calls of netif_tx_stop_all_queues() in vmlinux.

To fix this, remove WARN_ON in netif_tx_stop_queue()
as suggested by davem, and deinline netif_tx_stop_all_queues().

Change in code size is about 20k:

   text      data      bss       dec     hex filename
82426986 22255416 20627456 125309858 77813a2 vmlinux.before
82406248 22255416 20627456 125289120 777c2a0 vmlinux

gcc-4.7.2 still creates deinlined version of netif_tx_stop_queue
sometimes:

$ nm --size-sort vmlinux | grep netif_tx_stop_queue | wc -l
190

ffffffff81b558a8 <netif_tx_stop_queue>:
ffffffff81b558a8:       55                      push   %rbp
ffffffff81b558a9:       48 89 e5                mov    %rsp,%rbp
ffffffff81b558ac:       f0 80 8f e0 01 00 00    lock orb $0x1,0x1e0(%rdi)
ffffffff81b558b3:       01
ffffffff81b558b4:       5d                      pop    %rbp
ffffffff81b558b5:       c3                      retq

This needs additional fixing.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Alexei Starovoitov <alexei.starovoitov@gmail.com>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Joe Perches <joe@perches.com>
CC: David S. Miller <davem@davemloft.net>
CC: Jiri Pirko <jpirko@redhat.com>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: netfilter-devel@vger.kernel.org
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:05:35 -04:00
Scott Feldman
7889cbee83 switchdev: remove NETIF_F_HW_SWITCH_OFFLOAD feature flag
Roopa said remove the feature flag for this series and she'll work on
bringing it back if needed at a later date.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Jiri Pirko
9d47c0a2d9 switchdev: s/swdev_/switchdev_/
Turned out that "switchdev" sticks. So just unify all related terms to use
this prefix.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:53 -04:00
Alexander Duyck
181edb2bfa net: Add skb_free_frag to replace use of put_page in freeing skb->head
This change adds a function called skb_free_frag which is meant to
compliment the function netdev_alloc_frag.  The general idea is to enable a
more lightweight version of page freeing since we don't actually need all
the overhead of a put_page, and we don't quite fit the model of __free_pages.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 10:39:26 -04:00
Alexander Duyck
b63ae8ca09 mm/net: Rename and move page fragment handling from net/ to mm/
This change moves the __alloc_page_frag functionality out of the networking
stack and into the page allocation portion of mm.  The idea it so help make
this maintainable by placing it with other page allocation functions.

Since we are moving it from skbuff.c to page_alloc.c I have also renamed
the basic defines and structure from netdev_alloc_cache to page_frag_cache
to reflect that this is now part of a different kernel subsystem.

I have also added a simple __free_page_frag function which can handle
freeing the frags based on the skb->head pointer.  The model for this is
based off of __free_pages since we don't actually need to deal with all of
the cases that put_page handles.  I incorporated the virt_to_head_page call
and compound_order into the function as it actually allows for a signficant
size reduction by reducing code duplication.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 10:39:26 -04:00
Alexander Duyck
0e39250845 net: Store virtual address instead of page in netdev_alloc_cache
This change makes it so that we store the virtual address of the page
in the netdev_alloc_cache instead of the page pointer.  The idea behind
this is to avoid multiple calls to page_address since the virtual address
is required for every access, but the page pointer is only needed at
allocation or reset of the page.

While I was at it I also reordered the netdev_alloc_cache structure a bit
so that the size is always 16 bytes by dropping size in the case where
PAGE_SIZE is greater than or equal to 32KB.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 10:39:26 -04:00
b3e5838ac0 Merge branch 'for-4.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "Rather big for fixes pull.

   - SCC controllers never lived to see the light of the day.  Both
     libata and ide drivers removed.

   - In some configurations, link power management policy changes
     sometimes cause delayed spurious PHY events which can develop into
     noticeable failures.  This has been reported several times over the
     years.  Gabriele's patches suppress PHY events for a while after
     LPM policy changes which should help most of these failures without
     causing too much problem for hotplug use cases.

   - A few controller specific fixes"

[ Hmm.  I don't think removing SSC support is really a "fix", but hey, it
  removes a lot of lines of code.  Which I like.  So ...  good riddance ]

* 'for-4.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ahci: avoton port-disable reset-quirk
  ata: select DW_DMAC in case of SATA_DWC
  libata: Blacklist queued TRIM on all Samsung 800-series
  libata: Ignore spurious PHY event on LPM policy change
  libata: Add helper to determine when PHY events should be ignored
  ata: ahci_st: fixup layering violations / drvdata errors
  Remove celleb-only SCC PATA drivers
2015-05-11 10:54:20 -07:00
Daniel Borkmann
4cda01e86f net: sched: fix typo in net_device ifdef
This should have been #ifdef not #if.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Fixes: d2788d3488 ("net: sched: further simplify handle_ing")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11 13:37:42 -04:00
Daniel Borkmann
d2788d3488 net: sched: further simplify handle_ing
Ingress qdisc has no other purpose than calling into tc_classify()
that executes attached classifier(s) and action(s).

It has a 1:1 relationship to dev->ingress_queue. After having commit
087c1a601a ("net: sched: run ingress qdisc without locks") removed
the central ingress lock, one major contention point is gone.

The extra indirection layers however, are not necessary for calling
into ingress qdisc. pktgen calling locally into netif_receive_skb()
with a dummy u32, single CPU result on a Supermicro X10SLM-F, Xeon
E3-1240: before ~21,1 Mpps, after patch ~22,9 Mpps.

We can redirect the private classifier list to the netdev directly,
without changing any classifier API bits (!) and execute on that from
handle_ing() side. The __QDISC_STATE_DEACTIVATE test can be removed,
ingress qdisc doesn't have a queue and thus dev_deactivate_queue()
is also not applicable, ingress_cl_list provides similar behaviour.
In other words, ingress qdisc acts like TCQ_F_BUILTIN qdisc.

One next possible step is the removal of the dev's ingress (dummy)
netdev_queue, and to only have the list member in the netdevice
itself.

Note, the filter chain is RCU protected and individual filter elements
are being kfree'd by sched subsystem after RCU grace period. RCU read
lock is being held by __netif_receive_skb_core().

Joint work with Alexei Starovoitov.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11 11:10:35 -04:00
Eric W. Biederman
11aa9c28b4 net: Pass kern from net_proto_family.create to sk_alloc
In preparation for changing how struct net is refcounted
on kernel sockets pass the knowledge that we are creating
a kernel socket from sock_create_kern through to sk_alloc.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11 10:50:17 -04:00
Eric W. Biederman
eeb1bd5c40 net: Add a struct net parameter to sock_create_kern
This is long overdue, and is part of cleaning up how we allocate kernel
sockets that don't reference count struct net.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11 10:50:17 -04:00
Eric W. Biederman
140e807da1 tun: Utilize the normal socket network namespace refcounting.
There is no need for tun to do the weird network namespace refcounting.
The existing network namespace refcounting in tfile has almost exactly
the same lifetime.  So rewrite the code to use the struct sock network
namespace refcounting and remove the unnecessary hand rolled network
namespace refcounting and the unncesary tfile->net.

This change allows the tun code to directly call sock_put bypassing
sock_release and making SOCK_EXTERNALLY_ALLOCATED unnecessary.

Remove the now unncessary tun_release so that if anything tries to use
the sock_release code path the kernel will oops, and let us know about
the bug.

The macvtap code already uses it's internal socket this way.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11 10:50:16 -04:00
Nicolas Dichtel
59324cf35a netlink: allow to listen "all" netns
More accurately, listen all netns that have a nsid assigned into the netns
where the netlink socket is opened.
For this purpose, a netlink socket option is added:
NETLINK_LISTEN_ALL_NSID. When this option is set on a netlink socket, this
socket will receive netlink notifications from all netns that have a nsid
assigned into the netns where the socket has been opened. The nsid is sent
to userland via an anscillary data.

With this patch, a daemon needs only one socket to listen many netns. This
is useful when the number of netns is high.

Because 0 is a valid value for a nsid, the field nsid_is_set indicates if
the field nsid is valid or not. skb->cb is initialized to 0 on skb
allocation, thus we are sure that we will never send a nsid 0 by error to
the userland.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-09 22:15:31 -04:00
9d88f22a81 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "Two patches from the irq departement:

   - a simple fix to make dummy_irq_chip usable for wakeup scenarios

   - removal of the gic arch_extn hackery.  Now that all users are
     converted we really want to get rid of the interface so people wont
     come up with new use cases"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: gic: Drop support for gic_arch_extn
  genirq: Set IRQCHIP_SKIP_SET_WAKE flag for dummy_irq_chip
2015-05-09 14:59:05 -07:00
Daniel Borkmann
ac67eb2c53 seccomp, filter: add and use bpf_prog_create_from_user from seccomp
Seccomp has always been a special candidate when it comes to preparation
of its filters in seccomp_prepare_filter(). Due to the extra checks and
filter rewrite it partially duplicates code and has BPF internals exposed.

This patch adds a generic API inside the BPF code code that seccomp can use
and thus keep it's filter preparation code minimal and better maintainable.
The other side-effect is that now classic JITs can add seccomp support as
well by only providing a BPF_LDX | BPF_W | BPF_ABS translation.

Tested with seccomp and BPF test suites.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Nicolas Schichan <nschichan@freebox.fr>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-09 17:35:05 -04:00
Nicolas Schichan
d9e12f42e5 seccomp: simplify seccomp_prepare_filter and reuse bpf_prepare_filter
Remove the calls to bpf_check_classic(), bpf_convert_filter() and
bpf_migrate_runtime() and let bpf_prepare_filter() take care of that
instead.

seccomp_check_filter() is passed to bpf_prepare_filter() so that it
gets called from there, after bpf_check_classic().

We can now remove exposure of two internal classic BPF functions
previously used by seccomp. The export of bpf_check_classic() symbol,
previously known as sk_chk_filter(), was there since pre git times,
and no in-tree module was using it, therefore remove it.

Joint work with Daniel Borkmann.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-09 17:35:05 -04:00
Nicolas Schichan
4ae92bc77a net: filter: add a callback to allow classic post-verifier transformations
This is in preparation for use by the seccomp code, the rationale is
not to duplicate additional code within the seccomp layer, but instead,
have it abstracted and hidden within the classic BPF API.

As an interim step, this now also makes bpf_prepare_filter() visible
(not as exported symbol though), so that seccomp can reuse that code
path instead of reimplementing it.

Joint work with Daniel Borkmann.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-09 17:35:05 -04:00
1daac193f2 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A collection of fixes since the merge window;

   - fix for a double elevator module release, from Chao Yu.  Ancient bug.

   - the splice() MORE flag fix from Christophe Leroy.

   - a fix for NVMe, fixing a patch that went in in the merge window.
     From Keith.

   - two fixes for blk-mq CPU hotplug handling, from Ming Lei.

   - bdi vs blockdev lifetime fix from Neil Brown, fixing and oops in md.

   - two blk-mq fixes from Shaohua, fixing a race on queue stop and a
     bad merge issue with FUA writes.

   - division-by-zero fix for writeback from Tejun.

   - a block bounce page accounting fix, making sure we inc/dec after
     bouncing so that pre/post IO pages match up.  From Wang YanQing"

* 'for-linus' of git://git.kernel.dk/linux-block:
  splice: sendfile() at once fails for big files
  blk-mq: don't lose requests if a stopped queue restarts
  blk-mq: fix FUA request hang
  block: destroy bdi before blockdev is unregistered.
  block:bounce: fix call inc_|dec_zone_page_state on different pages confuse value of NR_BOUNCE
  elevator: fix double release of elevator module
  writeback: use |1 instead of +1 to protect against div by zero
  blk-mq: fix CPU hotplug handling
  blk-mq: fix race between timeout and CPU hotplug
  NVMe: Fix VPD B0 max sectors translation
2015-05-08 19:49:35 -07:00
26b293e854 The newly added ftrace_print_array_seq() function had a bug in it. Luckily,
the only user of it didn't make the 4.1 merge window. But the helper
 function should be fixed before 4.2 when the users start coming in.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVTBNUAAoJEEjnJuOKh9ld0VQIAJWPLivGbGJyjSqFd1NXLidS
 ytcbM0dquYjvQ94EDxoA+uBm34hk1JbvcI+FgiOihEeyGh7wrhdibEVGT40TzE2I
 XrfTVwPfN5/k2D5MeZzzRkeoTDufc33MgqTURymRQSzkmHf5GttPXxZ/ckO9Hz9A
 XqzXaHcmnauZSmUY12q8rMtbKYP/dN5hUdmR6p44bMgDJehQkmTzJkxbe6t98b+t
 8y3YAcK5HclYITC2lBVHSw5z8e9F/B7UmrNxvNkcV5kqdYg3NnVnA292kSMft5zo
 WRk1nH4eVARq2dmGQ289QpneHqtMx22RU42m/t8M/v0OUANhlPaDb/RHlyDWJF4=
 =4JGY
 -----END PGP SIGNATURE-----

Merge tag 'trace-fixes-v4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "The newly added ftrace_print_array_seq() function had a bug in it.
  Luckily, the only user of it didn't make the 4.1 merge window.

  But the helper function should be fixed before 4.2 when the users
  start coming in"

* tag 'trace-fixes-v4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Make ftrace_print_array_seq compute buf_len
2015-05-08 18:22:05 -07:00
Alex Bennée
ac01ce1410 tracing: Make ftrace_print_array_seq compute buf_len
The only caller to this function (__print_array) was getting it wrong by
passing the array length instead of buffer length. As the element size
was already being passed for other reasons it seems reasonable to push
the calculation of buffer length into the function.

Link: http://lkml.kernel.org/r/1430320727-14582-1-git-send-email-alex.bennee@linaro.org

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-05-06 23:03:23 -04:00
d8fce2db72 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also an uncore PMU driver fix and an uncore
  PMU driver hardware-enablement addition"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf probe: Fix segfault if passed with ''.
  perf report: Fix -T/--threads option to work again
  perf bench numa: Fix immediate meeting of convergence condition
  perf bench numa: Fixes of --quiet argument
  perf bench futex: Fix hung wakeup tasks after requeueing
  perf probe: Fix bug with global variables handling
  perf top: Fix a segfault when kernel map is restricted.
  tools lib traceevent: Fix build failure on 32-bit arch
  perf kmem: Fix compiles on RHEL6/OL6
  tools lib api: Undefine _FORTIFY_SOURCE before setting it
  perf kmem: Consistently use PRIu64 for printing u64 values
  perf trace: Disable events and drain events when forked workload ends
  perf trace: Enable events when doing system wide tracing and starting a workload
  perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver
  perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs
  perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu
2015-05-06 10:47:25 -07:00
Ryusuke Konishi
d8fd150fe3 nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()
The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-05 17:10:11 -07:00
Guenter Roeck
05836c378c util_macros.h: have array pointer point to array of constants
Using the new find_closest() macro can result in the following sparse
warnings.

  drivers/hwmon/lm85.c:194:16: warning:
  		incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:194:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:194:16:    got int static const [toplevel] *<noident>
  drivers/hwmon/lm85.c:210:16: warning:
  		incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:210:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:210:16:    got int const *map

This is because the array passed to find_closest() will typically be
declared as array of constants, but the macro declares a non-constant
pointer to it.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-05 17:10:11 -07:00
Alexander Duyck
9545b22da6 vlan: Use eth_proto_is_802_3
Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto).

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-05 19:24:43 -04:00
Alexander Duyck
2c7a88c252 etherdev: Fix sparse error, make test usable by other functions
This change does two things.  First it fixes a sparse error for the fact
that the __be16 degrades to an integer.  Since that is actually what I am
kind of doing I am simply working around that by forcing both sides of the
comparison to u16.

Also I realized on some compilers I was generating another instruction for
big endian systems such as PowerPC since it was masking the value before
doing the comparison.  So to resolve that I have simply pulled the mask out
and wrapped it in an #ifndef __BIG_ENDIAN.

Lastly I pulled this all out into its own function.  I notices there are
similar checks in a number of other places so this function can be reused
there to help reduce overhead in these paths as well.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-05 19:24:42 -04:00
Eric Dumazet
cd8ae85299 tcp: provide SYN headers for passive connections
This patch allows a server application to get the TCP SYN headers for
its passive connections.  This is useful if the server is doing
fingerprinting of clients based on SYN packet contents.

Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.

The first is used on a socket to enable saving the SYN headers
for child connections. This can be set before or after the listen()
call.

The latter is used to retrieve the SYN headers for passive connections,
if the parent listener has enabled TCP_SAVE_SYN.

TCP_SAVED_SYN is read once, it frees the saved SYN headers.

The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
headers.

Original patch was written by Tom Herbert, I changed it to not hold
a full skb (and associated dst and conntracking reference).

We have used such patch for about 3 years at Google.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-05 16:02:34 -04:00
d9cee5d4f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a build problem with bcm63xx and yet another fix to the
  memzero_explicit function to ensure that the memset is not elided"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  hwrng: bcm63xx - Fix driver compilation
  lib: make memzero_explicit more robust against dead store elimination
2015-05-05 09:03:52 -07:00
David S. Miller
73e84313ee Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-05-04

Here's the first bluetooth-next pull request for 4.2:

 - Various fixes for at86rf230 driver
 - ieee802154: trace events support for rdev->ops
 - HCI UART driver refactoring
 - New Realtek IDs added to btusb driver
 - Off-by-one fix for rtl8723b in btusb driver
 - Refactoring of btbcm driver for both UART & USB use

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-04 15:36:07 -04:00
Shaohua Li
b2387ddcce blk-mq: fix FUA request hang
When a FUA request enters its DATA stage of flush pipeline, the
request is added to mq requeue list, the request will then be added to
ctx->rq_list. blk_mq_attempt_merge() might merge the request with a bio.
Later when the request is finished the flush pipeline, the
request->__data_len is 0. Then I only saw the bio gets endio called, the
original request never finish.

Adding REQ_FLUSH_SEQ into REQ_NOMERGE_FLAGS looks an easy fix.

stable: 3.15+

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-04 13:09:55 -06:00
Linus Lüssing
9afd85c9e4 net: Export IGMP/MLD message validation code
With this patch, the IGMP and MLD message validation functions are moved
from the bridge code to IPv4/IPv6 multicast files. Some small
refactoring was done to enhance readibility and to iron out some
differences in behaviour between the IGMP and MLD parsing code (e.g. the
skb-cloning of MLD messages is now only done if necessary, just like the
IGMP part always did).

Finally, these IGMP and MLD message validation functions are exported so
that not only the bridge can use it but batman-adv later, too.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-04 14:49:23 -04:00
Daniel Borkmann
7829fb09a2 lib: make memzero_explicit more robust against dead store elimination
In commit 0b053c9518 ("lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be elimiated as dead store.

While using barrier() works well for the case of gcc, recent efforts
from LLVMLinux people suggest to use llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus elimitate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.

Based on some experiments, icc is a bit special on its own, while it
doesn't seem to eliminate the memset(), it could do so with an own
implementation, and then result in similar findings as with llvm.

The fix in this patch now works for all three compilers (also tested
with more aggressive optimization levels). Arguably, in the current
kernel tree it's more of a theoretical issue, but imho, it's better
to be pedantic about it.

It's clearly visible with gcc/llvm though, with the below code: if we
would have used barrier() only here, llvm would have omitted clearing,
not so with barrier_data() variant:

  static inline void memzero_explicit(void *s, size_t count)
  {
    memset(s, 0, count);
    barrier_data(s);
  }

  int main(void)
  {
    char buff[20];
    memzero_explicit(buff, sizeof(buff));
    return 0;
  }

  $ gcc -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
   0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
   0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
   0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
   0x000000000040041f <+31>: xor   %eax,%eax
   0x0000000000400421 <+33>: retq
  End of assembler dump.

  $ clang -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
   0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
   0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
   0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
   0x0000000000400505 <+21>: xor    %eax,%eax
   0x0000000000400507 <+23>: retq
  End of assembler dump.

As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
this in compiler-gcc.h only to be picked up. For a fallback or otherwise
unsupported compiler, we define it as a barrier. Similarly, for ecc which
does not support gcc inline asm.

Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
Reported-by: Stephan Mueller <smueller@chronox.de>
Tested-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Stephan Mueller <smueller@chronox.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: mancha security <mancha1@zoho.com>
Cc: Mark Charlebois <charlebm@gmail.com>
Cc: Behan Webster <behanw@converseincode.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-05-04 17:49:51 +08:00
Tom Herbert
50fb799289 net: Add skb_get_hash_perturb
This calls flow_disect and __skb_get_hash to procure a hash for a
packet. Input includes a key to initialize jhash. This function
does not set skb->hash.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-04 00:09:08 -04:00
Alexander Duyck
d54385ce68 etherdev: Process is_multicast_ether_addr at same size as other operations
This change makes it so that we process the address in
is_multicast_ether_addr at the same size as the other calls.  This allows
us to avoid duplicate reads when used with other calls such as
is_zero_ether_addr or eth_addr_copy.  In addition I have added a 64 bit
version of the function so in eth_type_trans we can process the destination
address as a 64 bit value throughout.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-03 22:30:36 -04:00
David S. Miller
3715544750 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Merge net into net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-02 22:05:58 -04:00
6c3c1eb3c3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Receive packet length needs to be adjust by 2 on RX to accomodate
    the two padding bytes in altera_tse driver.  From Vlastimil Setka.

 2) If rx frame is dropped due to out of memory in macb driver, we leave
    the receive ring descriptors in an undefined state.  From Punnaiah
    Choudary Kalluri

 3) Some netlink subsystems erroneously signal NLM_F_MULTI.  That is
    only for dumps.  Fix from Nicolas Dichtel.

 4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via
    the ipv4_mtu() helper.  From Herbert Xu.

 5) Fix null deref in bridge netfilter, and miscalculated lengths in
    jump/goto nf_tables verdicts.  From Florian Westphal.

 6) Unhash ping sockets properly.

 7) Software implementation of BPF divide did 64/32 rather than 64/64
    bit divide.  The JITs got it right.  Fix from Alexei Starovoitov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
  ipv4: Missing sk_nulls_node_init() in ping_unhash().
  net: fec: Fix RGMII-ID mode
  net/mlx4_en: Schedule napi when RX buffers allocation fails
  netxen_nic: use spin_[un]lock_bh around tx_clean_lock
  net/mlx4_core: Fix unaligned accesses
  mlx4_en: Use correct loop cursor in error path.
  cxgb4: Fix MC1 memory offset calculation
  bnx2x: Delay during kdump load
  net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table
  net: dsa: Fix scope of eeprom-length property
  net: macb: Fix race condition in driver when Rx frame is dropped
  hv_netvsc: Fix a bug in netvsc_start_xmit()
  altera_tse: Correct rx packet length
  mlx4: Fix tx ring affinity_mask creation
  tipc: fix problem with parallel link synchronization mechanism
  tipc: remove wrong use of NLM_F_MULTI
  bridge/nl: remove wrong use of NLM_F_MULTI
  bridge/mdb: remove wrong use of NLM_F_MULTI
  net: sched: act_connmark: don't zap skb->nfct
  trivial: net: systemport: bcmsysport.h: fix 0x0x prefix
  ...
2015-05-01 20:51:04 -07:00
9dbbe3cfc3 Remove from guest code the handling of task migration during a
pvclock read; instead use the correct protocol in KVM.
 
 This removes the need for task migration notifiers in core
 scheduler code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJVQkUWAAoJEL/70l94x66DhfcH/A8RTHUOELtoy+v2weahn21m
 FFWEnEUlCWzYgmiddgFdlr6+ub386W3ryFsXKPqjrn/8LVv3yS7tK1NJF8d03LQw
 n7HtIsrF01E9UI8CIWO4S/mUxWQev6vEJ9NXtNrsJcRmhSeLaIZkPjTH8Zqyx4i9
 ZvG4731WHXmxvbJ03bfJU9Y8OwHXe55GMi614aTxPndVBGdvIRu2Oj6aTfQTeab/
 7tEujub0MKWp74a7eyNU4GItcvIAXZCQt2wMc5dN1VK3ma5FTOnHIOuhAb8mACFF
 qEeGhtxAnOf7W+s9J8i7zVBdA5MOS0vUKng361ZOVGDb0OLqcVADW7GpuTZfRAM=
 =2A7v
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm changes from Paolo Bonzini:
 "Remove from guest code the handling of task migration during a pvclock
  read; instead use the correct protocol in KVM.

  This removes the need for task migration notifiers in core scheduler
  code"

[ The scheduler people really hated the migration notifiers, so this was
  kind of required  - Linus ]

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86: pvclock: Really remove the sched notifier for cross-cpu migrations
  kvm: x86: fix kvmclock update protocol
2015-04-30 09:44:04 -07:00
9263a06a58 TTY/Serial fixes for 4.1-rc2
Here are some small tty/serial driver fixes for 4.1-rc2.
 
 They include some minor fixes that resolve reported issues, and a new
 device quirk.
 
 All have been in linux-next succesfully.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlVCMgMACgkQMUfUDdst+ylmRwCgzADm9JPmMS7DX0g21mfVSeQK
 nI0AoIiYm3HHxu7wma7o3DowGLvuScwt
 =8lGO
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are some small tty/serial driver fixes for 4.1-rc2.

  They include some minor fixes that resolve reported issues, and a new
  device quirk.

  All have been in linux-next succesfully"

* tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250_pci: Add support for 16 port Exar boards
  serial: samsung: fix serial console break
  tty/serial: at91: maxburst was missing for dma transfers
  serial: of-serial: Remove device_type = "serial" registration
  serial: xilinx: Use platform_get_irq to get irq description structure
  serial: core: Fix kernel-doc build warnings
  tty: Re-add external interface for tty_set_termios()
2015-04-30 09:30:07 -07:00
dcca8de0aa USB fixes for 4.2-rc1
Here are a number of small USB fixes for 4.2-rc2.  They revert one
 problem patch, fix some minor things, and add some new quirks for
 "broken" devices.
 
 All have been in linux-next successfully.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlVCMNMACgkQMUfUDdst+ylQJwCgsGHQVK4YgrIOCpIkXoc+riy1
 VWkAnip86mUGKRej4jrrRvTGvm3maeTj
 =/oWf
 -----END PGP SIGNATURE-----

Merge tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a number of small USB fixes for 4.2-rc2.  They revert one
  problem patch, fix some minor things, and add some new quirks for
  "broken" devices.

  All have been in linux-next successfully"

* tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  cdc-acm: prevent infinite loop when parsing CDC headers.
  Revert "usb: host: ehci-msm: Use devm_ioremap_resource instead of devm_ioremap"
  usb: chipidea: otg: remove mutex unlock and lock while stop and start role
  uas: Set max_sectors_240 quirk for ASM1053 devices
  uas: Add US_FL_MAX_SECTORS_240 flag
  uas: Allow uas_use_uas_driver to return usb-storage flags
2015-04-30 09:08:53 -07:00