Commit graph

722274 commits

Author SHA1 Message Date
Eric Biggers
54c1fb39fe X.509: fix comparisons of ->pkey_algo
->pkey_algo used to be an enum, but was changed to a string by commit
4e8ae72a75 ("X.509: Make algo identifiers text instead of enum").  But
two comparisons were not updated.  Fix them to use strcmp().

This bug broke signature verification in certain configurations,
depending on whether the string constants were deduplicated or not.

Fixes: 4e8ae72a75 ("X.509: Make algo identifiers text instead of enum")
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:29 +00:00
Eric Biggers
18026d8668 KEYS: reject NULL restriction string when type is specified
keyctl_restrict_keyring() allows through a NULL restriction when the
"type" is non-NULL, which causes a NULL pointer dereference in
asymmetric_lookup_restriction() when it calls strcmp() on the
restriction string.

But no key types actually use a "NULL restriction" to mean anything, so
update keyctl_restrict_keyring() to reject it with EINVAL.

Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 97d3aa0f31 ("KEYS: Add a lookup_restriction function for the asymmetric key type")
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:29 +00:00
Colin Ian King
3d1f025542 security: keys: remove redundant assignment to key_ref
Variable key_ref is being assigned a value that is never read;
key_ref is being re-assigned a few statements later.  Hence this
assignment is redundant and can be removed.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:29 +00:00
Eric Biggers
aa33003620 X.509: use crypto_shash_digest()
Use crypto_shash_digest() instead of crypto_shash_init() followed by
crypto_shash_finup().  (For simplicity only; they are equivalent.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:29 +00:00
Eric Biggers
72f9a07b6b KEYS: be careful with error codes in public_key_verify_signature()
In public_key_verify_signature(), if akcipher_request_alloc() fails, we
return -ENOMEM.  But that error code was set 25 lines above, and by
accident someone could easily insert new code in between that assigns to
'ret', which would introduce a signature verification bypass.  Make the
code clearer by moving the -ENOMEM down to where it is used.

Additionally, the callers of public_key_verify_signature() only consider
a negative return value to be an error.  This means that if any positive
return value is accidentally introduced deeper in the call stack (e.g.
'return EBADMSG' instead of 'return -EBADMSG' somewhere in RSA),
signature verification will be bypassed.  Make things more robust by
having public_key_verify_signature() warn about positive errors and
translate them into -EINVAL.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:29 +00:00
Eric Biggers
a80745a6de pkcs7: use crypto_shash_digest()
Use crypto_shash_digest() instead of crypto_shash_init() followed by
crypto_shash_finup().  (For simplicity only; they are equivalent.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:28 +00:00
Eric Biggers
7204eb8590 pkcs7: fix check for self-signed certificate
pkcs7_validate_trust_one() used 'x509->next == x509' to identify a
self-signed certificate.  That's wrong; ->next is simply the link in the
linked list of certificates in the PKCS#7 message.  It should be
checking ->signer instead.  Fix it.

Fortunately this didn't actually matter because when we re-visited
'x509' on the next iteration via 'x509->signer', it was already seen and
not verified, so we returned -ENOKEY anyway.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:28 +00:00
Eric Biggers
8ecb506d34 pkcs7: return correct error code if pkcs7_check_authattrs() fails
If pkcs7_check_authattrs() returns an error code, we should pass that
error code on, rather than using ENOMEM.

Fixes: 99db443506 ("PKCS#7: Appropriately restrict authenticated attributes and content type")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:28 +00:00
Eric Biggers
8dfd2f22d3 509: fix printing uninitialized stack memory when OID is empty
Callers of sprint_oid() do not check its return value before printing
the result.  In the case where the OID is zero-length, -EBADMSG was
being returned without anything being written to the buffer, resulting
in uninitialized stack memory being printed.  Fix this by writing
"(bad)" to the buffer in the cases where -EBADMSG is returned.

Fixes: 4f73175d03 ("X.509: Add utility functions to render OIDs as strings")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:28 +00:00
Eric Biggers
47e0a208fb X.509: fix buffer overflow detection in sprint_oid()
In sprint_oid(), if the input buffer were to be more than 1 byte too
small for the first snprintf(), 'bufsize' would underflow, causing a
buffer overflow when printing the remainder of the OID.

Fortunately this cannot actually happen currently, because no users pass
in a buffer that can be too small for the first snprintf().

Regardless, fix it by checking the snprintf() return value correctly.

For consistency also tweak the second snprintf() check to look the same.

Fixes: 4f73175d03 ("X.509: Add utility functions to render OIDs as strings")
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:28 +00:00
Eric Biggers
0f30cbea00 X.509: reject invalid BIT STRING for subjectPublicKey
Adding a specially crafted X.509 certificate whose subjectPublicKey
ASN.1 value is zero-length caused x509_extract_key_data() to set the
public key size to SIZE_MAX, as it subtracted the nonexistent BIT STRING
metadata byte.  Then, x509_cert_parse() called kmemdup() with that bogus
size, triggering the WARN_ON_ONCE() in kmalloc_slab().

This appears to be harmless, but it still must be fixed since WARNs are
never supposed to be user-triggerable.

Fix it by updating x509_cert_parse() to validate that the value has a
BIT STRING metadata byte, and that the byte is 0 which indicates that
the number of bits in the bitstring is a multiple of 8.

It would be nice to handle the metadata byte in asn1_ber_decoder()
instead.  But that would be tricky because in the general case a BIT
STRING could be implicitly tagged, and/or could legitimately have a
length that is not a whole number of bytes.

Here was the WARN (cleaned up slightly):

    WARNING: CPU: 1 PID: 202 at mm/slab_common.c:971 kmalloc_slab+0x5d/0x70 mm/slab_common.c:971
    Modules linked in:
    CPU: 1 PID: 202 Comm: keyctl Tainted: G    B            4.14.0-09238-g1d3b78bbc6e9 #26
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    task: ffff880033014180 task.stack: ffff8800305c8000
    Call Trace:
     __do_kmalloc mm/slab.c:3706 [inline]
     __kmalloc_track_caller+0x22/0x2e0 mm/slab.c:3726
     kmemdup+0x17/0x40 mm/util.c:118
     kmemdup include/linux/string.h:414 [inline]
     x509_cert_parse+0x2cb/0x620 crypto/asymmetric_keys/x509_cert_parser.c:106
     x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
     asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
     key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
     SYSC_add_key security/keys/keyctl.c:122 [inline]
     SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0x96

Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:27 +00:00
Eric Biggers
81a7be2cd6 ASN.1: check for error from ASN1_OP_END__ACT actions
asn1_ber_decoder() was ignoring errors from actions associated with the
opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT,
ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT.  In practice, this
meant the pkcs7_note_signed_info() action (since that was the only user
of those opcodes).  Fix it by checking for the error, just like the
decoder does for actions associated with the other opcodes.

This bug allowed users to leak slab memory by repeatedly trying to add a
specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY).

In theory, this bug could also be used to bypass module signature
verification, by providing a PKCS#7 message that is misparsed such that
a signature's ->authattrs do not contain its ->msgdigest.  But it
doesn't seem practical in normal cases, due to restrictions on the
format of the ->authattrs.

Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:27 +00:00
Eric Biggers
e0058f3a87 ASN.1: fix out-of-bounds read when parsing indefinite length item
In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed
to the action functions before their lengths had been computed, using
the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH).  This resulted in
reading data past the end of the input buffer, when given a specially
crafted message.

Fix it by rearranging the code so that the indefinite length is resolved
before the action is called.

This bug was originally found by fuzzing the X.509 parser in userspace
using libFuzzer from the LLVM project.

KASAN report (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline]
    BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    Read of size 128 at addr ffff880035dd9eaf by task keyctl/195

    CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
     __dump_stack lib/dump_stack.c:17 [inline]
     dump_stack+0xd1/0x175 lib/dump_stack.c:53
     print_address_description+0x78/0x260 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x23f/0x350 mm/kasan/report.c:409
     memcpy+0x1f/0x50 mm/kasan/kasan.c:302
     memcpy ./include/linux/string.h:341 [inline]
     x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
     asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447
     x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89
     x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
     asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
     key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
     SYSC_add_key security/keys/keyctl.c:122 [inline]
     SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0x96

    Allocated by task 195:
     __do_kmalloc_node mm/slab.c:3675 [inline]
     __kmalloc_node+0x47/0x60 mm/slab.c:3682
     kvmalloc ./include/linux/mm.h:540 [inline]
     SYSC_add_key security/keys/keyctl.c:104 [inline]
     SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0x96

Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Reported-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:27 +00:00
Eric Biggers
4dca6ea1d9 KEYS: add missing permission check for request_key() destination
When the request_key() syscall is not passed a destination keyring, it
links the requested key (if constructed) into the "default" request-key
keyring.  This should require Write permission to the keyring.  However,
there is actually no permission check.

This can be abused to add keys to any keyring to which only Search
permission is granted.  This is because Search permission allows joining
the keyring.  keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_SESSION_KEYRING)
then will set the default request-key keyring to the session keyring.
Then, request_key() can be used to add keys to the keyring.

Both negatively and positively instantiated keys can be added using this
method.  Adding negative keys is trivial.  Adding a positive key is a
bit trickier.  It requires that either /sbin/request-key positively
instantiates the key, or that another thread adds the key to the process
keyring at just the right time, such that request_key() misses it
initially but then finds it in construct_alloc_key().

Fix this bug by checking for Write permission to the keyring in
construct_get_dest_keyring() when the default keyring is being used.

We don't do the permission check for non-default keyrings because that
was already done by the earlier call to lookup_user_key().  Also,
request_key_and_link() is currently passed a 'struct key *' rather than
a key_ref_t, so the "possessed" bit is unavailable.

We also don't do the permission check for the "requestor keyring", to
continue to support the use case described by commit 8bbf4976b5
("KEYS: Alter use of key instantiation link-to-keyring argument") where
/sbin/request-key recursively calls request_key() to add keys to the
original requestor's destination keyring.  (I don't know of any users
who actually do that, though...)

Fixes: 3e30148c3d ("[PATCH] Keys: Make request-key create an authorisation key")
Cc: <stable@vger.kernel.org>	# v2.6.13+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:27 +00:00
Eric Biggers
a2d8737d5c KEYS: remove unnecessary get/put of explicit dest_keyring
In request_key_and_link(), in the case where the dest_keyring was
explicitly specified, there is no need to get another reference to
dest_keyring before calling key_link(), then drop it afterwards.  This
is because by definition, we already have a reference to dest_keyring.

This change is useful because we'll be making
construct_get_dest_keyring() able to return an error code, and we don't
want to have to handle that error here for no reason.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:27 +00:00
Linus Torvalds
fd6d2e506c A handful of documentation fixes. The most significant of these addresses
a problem with the new warning mode: it can break the build when confronted
 with a source file containing malformed kerneldoc comments.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaJcMHAAoJEI3ONVYwIuV6NqEQAMjstwhy1+OwvFBac492CXsN
 O/nbPbz11xtERfXOuAOQ8xAptQbvu4DzFqEQJnuNFmRFwH6fLUxfsF/2Mis6RgWX
 ky5m33uoUonNwoifW36r5Zt4BvQb2p9MUH3V9VnOSVIz9i1QAXufgqhunxe9zByA
 EpnvSRfkMtmO4ZK54oaui729D/yEszkvWJQb7xYFuRGMNfz0ENCrPq1WGYKBuVEj
 kLan49xfGb1sFSjrieNE/1I98Qea9j4HsFN4YcTaXmQalJ0OJZCBo1ybQlSR1gxM
 08lg4xEmyMjFffOh14D1p4qnRfD2h7uwG4R8JuFbSeiDm6Q2KktA+KdjTXbKjkrU
 mXpf/108tbSbDYVBzoQWkls/RgMGScH6qgfVVcrSGjsFaZyTCzxNYF0aMZ9sLSV0
 rZBYLGvKkfDPaVvtIcP/np3QR06bCZuy9/kAlIwvNSP2md34L9Wz+kmxpJQV2kqh
 HL7vnAwXtZHHzxAAtBdZtOq+bUhmEPlUeqJkM9JrI2j5YuZK68GJYZ9kQDMAaldk
 bn6+eRTSbMgEtPs0aKoy+GxUkW6RJudVa+DMDI8g4w2nQ7Bd2367FlDNw8qwTvGB
 bg4pybcNB3ZzCM+xVmqjbw+7j4kidHj0+lhcFfDfJIJ9vRmT9dIC+KrClEL5IWeR
 El2g1Xsh60fH9G1+2c0L
 =ZKWF
 -----END PGP SIGNATURE-----

Merge tag 'docs-4.15-fixes' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
 "A handful of documentation fixes.

  The most significant of these addresses a problem with the new warning
  mode: it can break the build when confronted with a source file
  containing malformed kerneldoc comments"

* tag 'docs-4.15-fixes' of git://git.lwn.net/linux:
  Documentation: fix docs build error after source file removed
  scsi: documentation: Fix case of 'scsi_device' struct mention(s)
  genericirq.rst: Remove :c:func:`...` in code blocks
  dmaengine: doc : Fix warning "Title underline too short" while make xmldocs
  scripts/kernel-doc: Don't fail with status != 0 if error encountered with -none
2017-12-04 13:55:28 -08:00
Linus Torvalds
2391f0b480 virtio,qemu: bugfixes
A couple of bugfixes that just became ready.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJaIW15AAoJECgfDbjSjVRprxIH/RzUfSAfgaxhIVm2j3CRarGS
 G0xJvKxuXpiDnEtfu5st9QRqNo8CMuPV02kIaOb0Aqq2GfdEb0sCFMTLbfnYuXul
 uECu3uquXMUbuRXffQCHnDN4qqDWKRim8STIDPUlxjnBQqkos+JFn+MQ3vF2KEI0
 SeA8u1YQs5Ji10ZPjZ7FEAirbxDFiLPb0Hb0tphBGSZPXBRV3qoA4qeKNOy45EkQ
 F4J/1YxX/nWca0HYjBl2NceIqc4U/HcNZpner5YcJLad6SQSU9mSfEnJJIhOrRmu
 /Ivn6Q1822/YVdePDcffLIrvFvM4HprT4BUYv0Cd/cTZZRFMCa16OXmDcPrWcLo=
 =Nvzh
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "virtio and qemu bugfixes

  A couple of bugfixes that just became ready"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_balloon: fix increment of vb->num_pfns in fill_balloon()
  virtio: release virtio index when fail to device_register
  fw_cfg: fix driver remove
2017-12-04 11:32:02 -08:00
Linus Torvalds
236fa078c6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Various TCP control block fixes, including one that crashes with
    SELinux, from David Ahern and Eric Dumazet.

 2) Fix ACK generation in rxrpc, from David Howells.

 3) ipvlan doesn't set the mark properly in the ipv4 route lookup key,
    from Gao Feng.

 4) SIT configuration doesn't take on the frag_off ipv4 field
    configuration properly, fix from Hangbin Liu.

 5) TSO can fail after device down/up on stmmac, fix from Lars Persson.

 6) Various bpftool fixes (mostly in JSON handling) from Quentin Monnet.

 7) Various SKB leak fixes in vhost/tun/tap (mostly observed as
    performance problems). From Wei Xu.

 8) mvpps's TX descriptors were not zero initialized, from Yan Markman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
  tcp: use IPCB instead of TCP_SKB_CB in inet_exact_dif_match()
  tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()
  rxrpc: Fix the MAINTAINERS record
  rxrpc: Use correct netns source in rxrpc_release_sock()
  liquidio: fix incorrect indentation of assignment statement
  stmmac: reset last TSO segment size after device open
  ipvlan: Add the skb->mark as flow4's member to lookup route
  s390/qeth: build max size GSO skbs on L2 devices
  s390/qeth: fix GSO throughput regression
  s390/qeth: fix thinko in IPv4 multicast address tracking
  tap: free skb if flags error
  tun: free skb in early errors
  vhost: fix skb leak in handle_rx()
  bnxt_en: Fix a variable scoping in bnxt_hwrm_do_send_msg()
  bnxt_en: fix dst/src fid for vxlan encap/decap actions
  bnxt_en: wildcard smac while creating tunnel decap filter
  bnxt_en: Need to unconditionally shut down RoCE in bnxt_shutdown
  phylink: ensure we take the link down when phylink_stop() is called
  sfp: warn about modules requiring address change sequence
  sfp: improve RX_LOS handling
  ...
2017-12-04 11:14:46 -08:00
Chris Metcalf
8ee5ad1d4c arch/tile: mark as orphaned
The chip family of TILEPro and TILE-Gx was developed by Tilera, which
was eventually acquired by Mellanox.  The tile architecture was added to
the kernel in 2010 and first appeared in 2.6.36.

Now at Mellanox we are developing new chips based on the ARM64
architecture; our last TILE-Gx chip (the Gx72) was released in 2013, and
our customers using tile architecture products are not, as far as we
know, looking to upgrade to newer kernel releases.  In the absence of
someone in the community stepping up to take over maintainership, this
commit marks the architecture as orphaned.

Cc: Chris Metcalf <metcalf@alum.mit.edu>
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-04 11:09:31 -08:00
Randy Dunlap
9956cfef34 Documentation: fix docs build error after source file removed
The pci/htirq.c file was removed so remove it from the documentation
file also.

Error: Cannot open file ../drivers/pci/htirq.c
WARNING: kernel-doc '../scripts/kernel-doc -rst -enable-lineno -export ../drivers/pci/htirq.c' failed with return code 2

Fixes: fd2fa6c18b ("x86/PCI: Remove unused HyperTransport interrupt support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-12-03 15:11:45 -07:00
David S. Miller
c2eb6d07a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2017-12-02

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a compilation warning in xdp redirect tracepoint due to
   missing bpf.h include that pulls in struct bpf_map, from Xie.

2) Limit the maximum number of attachable BPF progs for a given
   perf event as long as uabi is not frozen yet. The hard upper
   limit is now 64 and therefore the same as with BPF multi-prog
   for cgroups. Also add related error checking for the sample
   BPF loader when enabling and attaching to the perf event, from
   Yonghong.

3) Specifically set the RLIMIT_MEMLOCK for the test_verifier_log
   case, so that the test case can always pass and not fail in
   some environments due to too low default limit, also from
   Yonghong.

4) Fix up a missing license header comment for kernel/bpf/offload.c,
   from Jakub.

5) Several fixes for bpftool, among others a crash on incorrect
   arguments when json output is used, error message handling
   fixes on unknown options and proper destruction of json writer
   for some exit cases, all from Quentin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 13:08:30 -05:00
David S. Miller
e4485c7484 Merge branch 'tcp-cb-selinux-corruption'
Eric Dumazet says:

====================
tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()

James Morris reported kernel stack corruption bug that
we tracked back to commit 971f10eca1 ("tcp: better TCP_SKB_CB
layout to reduce cache line misses")

First patch needs to be backported to kernels >= 3.18,
while second patch needs to be backported to kernels >= 4.9, since
this was the time when inet_exact_dif_match appeared.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 12:39:15 -05:00
David Ahern
b4d1605a8e tcp: use IPCB instead of TCP_SKB_CB in inet_exact_dif_match()
After this fix : ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()"),
socket lookups happen while skb->cb[] has not been mangled yet by TCP.

Fixes: a04a480d43 ("net: Require exact match for TCP socket lookups if dif is l3mdev")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 12:39:15 -05:00
Eric Dumazet
eeea10b83a tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()
James Morris reported kernel stack corruption bug [1] while
running the SELinux testsuite, and bisected to a recent
commit bffa72cf7f ("net: sk_buff rbnode reorg")

We believe this commit is fine, but exposes an older bug.

SELinux code runs from tcp_filter() and might send an ICMP,
expecting IP options to be found in skb->cb[] using regular IPCB placement.

We need to defer TCP mangling of skb->cb[] after tcp_filter() calls.

This patch adds tcp_v4_fill_cb()/tcp_v4_restore_cb() in a very
similar way we added them for IPv6.

[1]
[  339.806024] SELinux: failure in selinux_parse_skb(), unable to parse packet
[  339.822505] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81745af5
[  339.822505]
[  339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-test #15
[  339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS FWKT68A   01/19/2017
[  339.885060] Call Trace:
[  339.896875]  <IRQ>
[  339.908103]  dump_stack+0x63/0x87
[  339.920645]  panic+0xe8/0x248
[  339.932668]  ? ip_push_pending_frames+0x33/0x40
[  339.946328]  ? icmp_send+0x525/0x530
[  339.958861]  ? kfree_skbmem+0x60/0x70
[  339.971431]  __stack_chk_fail+0x1b/0x20
[  339.984049]  icmp_send+0x525/0x530
[  339.996205]  ? netlbl_skbuff_err+0x36/0x40
[  340.008997]  ? selinux_netlbl_err+0x11/0x20
[  340.021816]  ? selinux_socket_sock_rcv_skb+0x211/0x230
[  340.035529]  ? security_sock_rcv_skb+0x3b/0x50
[  340.048471]  ? sk_filter_trim_cap+0x44/0x1c0
[  340.061246]  ? tcp_v4_inbound_md5_hash+0x69/0x1b0
[  340.074562]  ? tcp_filter+0x2c/0x40
[  340.086400]  ? tcp_v4_rcv+0x820/0xa20
[  340.098329]  ? ip_local_deliver_finish+0x71/0x1a0
[  340.111279]  ? ip_local_deliver+0x6f/0xe0
[  340.123535]  ? ip_rcv_finish+0x3a0/0x3a0
[  340.135523]  ? ip_rcv_finish+0xdb/0x3a0
[  340.147442]  ? ip_rcv+0x27c/0x3c0
[  340.158668]  ? inet_del_offload+0x40/0x40
[  340.170580]  ? __netif_receive_skb_core+0x4ac/0x900
[  340.183285]  ? rcu_accelerate_cbs+0x5b/0x80
[  340.195282]  ? __netif_receive_skb+0x18/0x60
[  340.207288]  ? process_backlog+0x95/0x140
[  340.218948]  ? net_rx_action+0x26c/0x3b0
[  340.230416]  ? __do_softirq+0xc9/0x26a
[  340.241625]  ? do_softirq_own_stack+0x2a/0x40
[  340.253368]  </IRQ>
[  340.262673]  ? do_softirq+0x50/0x60
[  340.273450]  ? __local_bh_enable_ip+0x57/0x60
[  340.285045]  ? ip_finish_output2+0x175/0x350
[  340.296403]  ? ip_finish_output+0x127/0x1d0
[  340.307665]  ? nf_hook_slow+0x3c/0xb0
[  340.318230]  ? ip_output+0x72/0xe0
[  340.328524]  ? ip_fragment.constprop.54+0x80/0x80
[  340.340070]  ? ip_local_out+0x35/0x40
[  340.350497]  ? ip_queue_xmit+0x15c/0x3f0
[  340.361060]  ? __kmalloc_reserve.isra.40+0x31/0x90
[  340.372484]  ? __skb_clone+0x2e/0x130
[  340.382633]  ? tcp_transmit_skb+0x558/0xa10
[  340.393262]  ? tcp_connect+0x938/0xad0
[  340.403370]  ? ktime_get_with_offset+0x4c/0xb0
[  340.414206]  ? tcp_v4_connect+0x457/0x4e0
[  340.424471]  ? __inet_stream_connect+0xb3/0x300
[  340.435195]  ? inet_stream_connect+0x3b/0x60
[  340.445607]  ? SYSC_connect+0xd9/0x110
[  340.455455]  ? __audit_syscall_entry+0xaf/0x100
[  340.466112]  ? syscall_trace_enter+0x1d0/0x2b0
[  340.476636]  ? __audit_syscall_exit+0x209/0x290
[  340.487151]  ? SyS_connect+0xe/0x10
[  340.496453]  ? do_syscall_64+0x67/0x1b0
[  340.506078]  ? entry_SYSCALL64_slow_path+0x25/0x25

Fixes: 971f10eca1 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: James Morris <james.l.morris@oracle.com>
Tested-by: James Morris <james.l.morris@oracle.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 12:39:14 -05:00
Linus Torvalds
ae64f9bd1d Linux 4.15-rc2 2017-12-03 11:01:47 -05:00
Linus Torvalds
87fc5c686e Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fix from Russell King:
 "Just one fix this time around, for the late commit in the merge window
  that triggered a problem with qemu. Qemu is apparently also going to
  receive a fix for the discovered issue"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: avoid faulting on qemu
2017-12-03 10:51:08 -05:00
Linus Torvalds
ae4806a38b Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Here are two bugfixes for I2C, fixing a memleak in the core and irq
  allocation for i801.

  Also three bugfixes for the at24 eeprom driver which Bartosz collected
  while taking over maintainership for this driver"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  eeprom: at24: check at24_read/write arguments
  eeprom: at24: fix reading from 24MAC402/24MAC602
  eeprom: at24: correctly set the size for at24mac402
  i2c: i2c-boardinfo: fix memory leaks on devinfo
  i2c: i801: Fix Failed to allocate irq -2147483648 error
2017-12-03 10:48:24 -05:00
Linus Torvalds
49a418d783 hwmon fixes for v4.15-rc2
Drop reference to obsolete maintainer tree
 Fix overflow bug in pmbus driver
 Fix SMBUS timeout problem in jc42 driver
 
 For the SMBUS timeout handling, we had a brief discussion if this should
 be considered a bug fix or a feature. Peter says "it fixes real problems
 where the application misbehave due to faulty content when reading from
 an eeprom", and he needs the patch in his company's v4.14 images. This is
 good enough for me and warrants backport to stable kernels.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaItAeAAoJEMsfJm/On5mBYrQQAJz+Ukg8blLKg1bqb01nZxEY
 4vOxqZySpqR5icW6KP0zn/ws8eKyFjGQQsRxoRePo9Zfj2Y9RKRKVOoRrs7McZ6s
 mO12KJBf13cGiB+Msm8JsjJv81E15RnDwsWw39+SPms3ueKiBDAl4xaH6PSeGKTZ
 zB8teDk7MLLwCplRuNbB3qrc0BCj4AgeAu3omHO7PpKClOCHRieJPLaFE2Fpzsu/
 P4RxUn4CFY0urgWJ5b9g5A3FdH8lOz8nfkiWPPnEb/IF+8tR9M3GYzwOp5r2uVul
 uKszDMKKx3Q+Hi+67/Ou2uLhCDnaxYtFHiN+REB9dRi33BHSuIsc4riCqa1ZMz8c
 XWQLbQq3u0bS1XgiegD4nihF2iNrj0fMcy2dcnUVWJNFKrfHjGAIodY0cKZJsZKW
 RqzWlX/aUVIpCSKxtJm0xDNPJS5FqKXLCV0xsNMF2Mz2JosT8o4IARZNlY7Ap0Be
 kRiQMXA/3y2RyeOUi82YHM0MorMt4icmTT3ztRrJpVbM1MBiiX2SefZquai5RLgp
 o/qTOrJ0gD8XzBhwP8wQYP5BvpPX2UX3V0sjLcRDNakqaOaDuslAMtzZ0sNn37Ng
 R+sAJNuFkLZkzhsa8IZSidngWMLvGS3Zjh2N75v6HcEaLrVoK2p6rBJM82Dg8cmp
 0GwZkjd72bdXIXuRLpri
 =5rYP
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-linus-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
 "Fixes:

   - Drop reference to obsolete maintainer tree

   - Fix overflow bug in pmbus driver

   - Fix SMBUS timeout problem in jc42 driver

  For the SMBUS timeout handling, we had a brief discussion if this
  should be considered a bug fix or a feature. Peter says "it fixes real
  problems where the application misbehave due to faulty content when
  reading from an eeprom", and he needs the patch in his company's v4.14
  images. This is good enough for me and warrants backport to stable
  kernels"

* tag 'hwmon-for-linus-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (jc42) optionally try to disable the SMBUS timeout
  hwmon: (pmbus) Use 64bit math for DIRECT format values
  hwmon: Drop reference to Jean's tree
2017-12-03 10:46:16 -05:00
David Howells
bcd1d601e5 rxrpc: Fix the MAINTAINERS record
Fix the MAINTAINERS record so that it's more obvious who the maintainer for
AF_RXRPC is.

Reported-by: Joe Perches <joe@perches.com>
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 10:05:20 -05:00
David Howells
c501256406 rxrpc: Use correct netns source in rxrpc_release_sock()
In rxrpc_release_sock() there may be no rx->local value to access, so we
can't unconditionally follow it to the rxrpc network namespace information
to poke the connection reapers.

Instead, use the socket's namespace pointer to find the namespace.

This unfixed code causes the following static checker warning:

	net/rxrpc/af_rxrpc.c:898 rxrpc_release_sock()
	error: we previously assumed 'rx->local' could be null (see line 887)

Fixes: 3d18cbb7fd ("rxrpc: Fix conn expiry timers")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 10:05:20 -05:00
Colin Ian King
886afc1dc4 liquidio: fix incorrect indentation of assignment statement
Remove one extraneous level of indentation on assignment statement.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 09:59:38 -05:00
David S. Miller
ed75e1ac1c linux-can-fixes-for-4.15-20171201
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlohX1sTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAe/sog7Dgc9sjPCACWLFU4N0upk2+QutPK5+VP0hKlkkpj
 FRM3JVUt+PxsLpnMj5C99f55poBqT8njadzLNXX1iZG3kXjt+LE8xF5VPc9jifYa
 wKHMNFedaDgyw4bEUTcDrtcDhP16gcxwDC7Wfwxi1pSoISs1S/KT3e82Ih3es92F
 mw+LxdgMHuOzN5fByIiwGUC3qdiNQLAkzAbbzEX6L/A7QSxraecMREUodJNrzDKo
 fD13xyCiq8alEIYfDFjLA/8K9645O4RTpZfIF0HyNtXTfaG4vl+/OswXLuY/xmJq
 X7pmUBppKwV3vt0nIGZUNy/qoaf8MbIkh6l2jzou94Ys3xNUfSErkwKQ
 =WgyM
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.15-20171201' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2017-12-01

this is a pull for net consisting of nine patches.

The first three patches are by Jimmy Assarsson for the kvaser_usb driver
and add the missing free()s in some error path, a signed/unsigned
comparison and ratelimit the error messages in case of incomplete
messages. Oliver Stäbler's patch for the ti_hecc driver fix the napi
poll function's return value. The return values of the probe function of
the peak_canfd and peak_pci PCI drivers are fixed by Stephane Grosjean's
patch. Two patches by me for the flexcan driver update the
bugs/features/quirks overview table and fix the error state transition
for the VF610 SoC. The two patches by Martin Kelly for the mcba_usb
driver fix a typo and a device disconnect bug.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 09:57:54 -05:00
Wolfram Sang
edef30980d AT24 fixes for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAlofxdUACgkQEacuoBRx
 13JvfA//bB7nUyvHlglfYMOq6z4O3L7IB5bbWs+Z5hoccR0nsnYPXwe3huIyuxBa
 vQLgztqpyzcsT3LYDWS7sD/NQQoHF0it2ZOSRl7pYo83I1KXeKcbimDp3NmurG89
 kx8AEtdhD4XkP/E7IwCsYlO7xTms0d9hShoyy0+0/GB4St5u+NOip+zT3TQVjJBS
 +ChnHMala2WBQji0wXmfOwFGHGEeEZXx5ZdrIheEiedFgOV0k7r/9IwaHVfh+DUb
 Lyb8fRCqTWwUblyky8nybSAtl4ki4jU5FJhiPsa3tI0MO2Vt5kzwWHYCYyQXo8D5
 BEYW1gsFY2R2SV/QF3SNmfWK+HQLgZl+MYzslCd25GiSDFD4mMhRvi85twig8mC8
 w86oaY22IVPAh5aUeF7W1FyRdYmAESsG2gOOG5dyxf8XPeGL7IqaV+GJkj2tPTdC
 OQ9q2hnO08e10e7Nub4k6NCWrIXK4WdNSjyRJz7DE2bvhWtYtnlDB3pCSKpxRNp7
 6aHOMKHnJyNsGIGYmfq7/Zyq511EtYux3xSAZa3NwnnukEE5CeHnYleO4IXXBxNU
 reVIq14QZ5AngT0QF7p+oE0evf2bqeNrv8i2UFF/qqtEA6mCbYnVsgu0+vzuy31t
 k1X/PkfAgQLdqI5TDDABP3y++PSfdWp07hIdlG9wPKVlwjFj2z8=
 =ldzq
 -----END PGP SIGNATURE-----

Merge tag 'at24-4.15-fixes-for-wolfram' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-current

Please consider pulling the following fixes for v4.15. While it doesn't
fix any regression introduced in the v4.15 merge window, we have a
feature in at24 since linux v4.8 - reading the mac address block from
at24mac series - which turned out to be not working.

This pull request contains changes that fix it together with a patch
that hardens the read and write argument sanitization with
out-of-bounds checks that were missing.
2017-12-03 15:55:20 +01:00
Lars Persson
45ab4b13e4 stmmac: reset last TSO segment size after device open
The mss variable tracks the last max segment size sent to the TSO
engine. We do not update the hardware as long as we receive skb:s with
the same value in gso_size.

During a network device down/up cycle (mapped to stmmac_release() and
stmmac_open() callbacks) we issue a reset to the hardware and it
forgets the setting for mss. However we did not zero out our mss
variable so the next transmission of a gso packet happens with an
undefined hardware setting.

This triggers a hang in the TSO engine and eventuelly the netdev
watchdog will bark.

Fixes: f748be531d ("stmmac: support new GMAC4")
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 09:47:42 -05:00
Gao Feng
a98a4ebc8c ipvlan: Add the skb->mark as flow4's member to lookup route
Current codes don't use skb->mark to assign flowi4_mark, it would
make the policy route rule with fwmark doesn't work as expected.

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 09:44:02 -05:00
David S. Miller
af57b7fffa Merge branch 's390-qeth-fixes'
Julian Wiedmann says:

====================
s390/qeth: fixes 2017-12-01

please apply the following three fixes for 4.15. These should also go
back to stable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:35:21 -05:00
Julian Wiedmann
0cbff6d454 s390/qeth: build max size GSO skbs on L2 devices
The current GSO skb size limit was copy&pasted over from the L3 path,
where it is needed due to a TSO limitation.
As L2 devices don't offer TSO support (and thus all GSO skbs are
segmented before they reach the driver), there's no reason to restrict
the stack in how large it may build the GSO skbs.

Fixes: d52aec97e5 ("qeth: enable scatter/gather in layer 2 mode")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:35:21 -05:00
Julian Wiedmann
6d69b1f1eb s390/qeth: fix GSO throughput regression
Using GSO with small MTUs currently results in a substantial throughput
regression - which is caused by how qeth needs to map non-linear skbs
into its IO buffer elements:
compared to a linear skb, each GSO-segmented skb effectively consumes
twice as many buffer elements (ie two instead of one) due to the
additional header-only part. This causes the Output Queue to be
congested with low-utilized IO buffers.

Fix this as follows:
If the MSS is low enough so that a non-SG GSO segmentation produces
order-0 skbs (currently ~3500 byte), opt out from NETIF_F_SG. This is
where we anticipate the biggest savings, since an SG-enabled
GSO segmentation produces skbs that always consume at least two
buffer elements.

Larger MSS values continue to get a SG-enabled GSO segmentation, since
1) the relative overhead of the additional header-only buffer element
becomes less noticeable, and
2) the linearization overhead increases.

With the throughput regression fixed, re-enable NETIF_F_SG by default to
reap the significant CPU savings of GSO.

Fixes: 5722963a8e ("qeth: do not turn on SG per default")
Reported-by: Nils Hoppmann <niho@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:35:21 -05:00
Julian Wiedmann
bc3ab70584 s390/qeth: fix thinko in IPv4 multicast address tracking
Commit 5f78e29cee ("qeth: optimize IP handling in rx_mode callback")
reworked how secondary addresses are managed for qeth devices.
Instead of dropping & subsequently re-adding all addresses on every
ndo_set_rx_mode() call, qeth now keeps track of the addresses that are
currently registered with the HW.
On a ndo_set_rx_mode(), we thus only need to do (de-)registration
requests for the addresses that have actually changed.

On L3 devices, the lookup for IPv4 Multicast addresses checks the wrong
hashtable - and thus never finds a match. As a result, we first delete
*all* such addresses, and then re-add them again. So each set_rx_mode()
causes a short period where the IPv4 Multicast addresses are not
registered, and the card stops forwarding inbound traffic for them.

Fix this by setting the ->is_multicast flag on the lookup object, thus
enabling qeth_l3_ip_from_hash() to search the correct hashtable and
find a match there.

Fixes: 5f78e29cee ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:35:20 -05:00
David S. Miller
7344ba039f Merge branch 'vhost-skb-leaks'
Wei Xu says:

====================
vhost: fix a few skb leaks

Matthew found a roughly 40% tcp throughput regression with commit
c67df11f(vhost_net: try batch dequing from skb array) as discussed
in the following thread:
https://www.mail-archive.com/netdev@vger.kernel.org/msg187936.html

v4:
- fix zero iov iterator count in tap/tap_do_read()(Jason)
- don't put tun in case of EBADFD(Jason)
- Replace msg->msg_control with new 'skb' when calling tun/tap_do_read()

v3:
- move freeing skb from vhost to tun/tap recvmsg() to not
  confuse the callers.

v2:
- add Matthew as the reporter, thanks matthew.
- moving zero headcount check ahead instead of defer consuming skb
  due to jason and mst's comment.
- add freeing skb in favor of recvmsg() fails.
====================

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:31:04 -05:00
Wei Xu
61d7853784 tap: free skb if flags error
tap_recvmsg() supports accepting skb by msg_control after
commit 3b4ba04acc ("tap: support receiving skb from msg_control"),
the skb if presented should be freed within the function, otherwise
it would be leaked.

Signed-off-by: Wei Xu <wexu@redhat.com>
Reported-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:31:03 -05:00
Wei Xu
c33ee15b38 tun: free skb in early errors
tun_recvmsg() supports accepting skb by msg_control after
commit ac77cfd425 ("tun: support receiving skb through msg_control"),
the skb if presented should be freed no matter how far it can go
along, otherwise it would be leaked.

This patch fixes several missed cases.

Signed-off-by: Wei Xu <wexu@redhat.com>
Reported-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:31:03 -05:00
Wei Xu
6e474083f3 vhost: fix skb leak in handle_rx()
Matthew found a roughly 40% tcp throughput regression with commit
c67df11f(vhost_net: try batch dequing from skb array) as discussed
in the following thread:
https://www.mail-archive.com/netdev@vger.kernel.org/msg187936.html

Eventually we figured out that it was a skb leak in handle_rx()
when sending packets to the VM. This usually happens when a guest
can not drain out vq as fast as vhost fills in, afterwards it sets
off the traffic jam and leaks skb(s) which occurs as no headcount
to send on the vq from vhost side.

This can be avoided by making sure we have got enough headcount
before actually consuming a skb from the batched rx array while
transmitting, which is simply done by moving checking the zero
headcount a bit ahead.

Signed-off-by: Wei Xu <wexu@redhat.com>
Reported-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:31:03 -05:00
David S. Miller
fa935ca224 Merge branch 'bnxt_en-fixes'
Michael Chan says:

====================
bnxt_en: Fixes.

A shutdown fix for SMARTNIC, 2 fixes related to TC Flower vxlan
filters, and the last one fixes an out-of-scope variable when sending
short firmware messages.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:25:39 -05:00
Vasundhara Volam
ebd5818cc5 bnxt_en: Fix a variable scoping in bnxt_hwrm_do_send_msg()
short_input variable is assigned to another data pointer which is
referred out of its scope. Fix it by moving short_input definition
to the beginning of bnxt_hwrm_do_send_msg() function.

No failure has been reported so far due to this issue.

Fixes: e605db801b ("bnxt_en: Support for Short Firmware Message")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:25:38 -05:00
Sathya Perla
e9ecc731a8 bnxt_en: fix dst/src fid for vxlan encap/decap actions
For flows that involve a vxlan encap action, the vxlan sock
interface may be specified as the outgoing interface. The driver
must resolve the outgoing PF interface used by this socket and
use the dst_fid of the PF in the hwrm_cfa_encap_record_alloc cmd.

Similarily for flows that have a vxlan decap action, the
fid of the incoming PF interface must be used as the src_fid in
the hwrm_cfa_decap_filter_alloc cmd.

Fixes: 8c95f773b4 ("bnxt_en: add support for Flower based vxlan encap/decap offload")
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:25:38 -05:00
Sunil Challa
c8fb7b8259 bnxt_en: wildcard smac while creating tunnel decap filter
While creating a decap filter the tunnel smac need not (and must not) be
specified as we cannot ascertain the neighbor in the recv path. 'ttl'
match is also not needed for the decap filter and must be wild-carded.

Fixes: f484f6782e ("bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter")
Signed-off-by: Sunil Challa <sunilkumar.challa@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:25:38 -05:00
Ray Jui
a7f3f939dd bnxt_en: Need to unconditionally shut down RoCE in bnxt_shutdown
The current 'bnxt_shutdown' implementation only invokes
'bnxt_ulp_shutdown' to shut down RoCE in the case when the system is in
the path of power off (SYSTEM_POWER_OFF). While this may work in most
cases, it does not work in the smart NIC case, when Linux 'reboot'
command is initiated from the Linux that runs on the ARM cores of the
NIC card. In this particular case, Linux 'reboot' results in a system
'L3' level reset where the entire ARM and associated subsystems are
being reset, but at the same time, Nitro core is being kept in sane state
(to allow external PCIe connected servers to continue to work). Without
properly shutting down RoCE and freeing all associated resources, it
results in the ARM core to hang immediately after the 'reboot'

By always invoking 'bnxt_ulp_shutdown' in 'bnxt_shutdown', it fixes the
above issue

Fixes: 0efd2fc65c ("bnxt_en: Add a callback to inform RDMA driver during PCI shutdown.")

Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:25:38 -05:00
John Pittman
a08415ea2a scsi: documentation: Fix case of 'scsi_device' struct mention(s)
In scsi_mid_low_api.txt a the scsi_device structure is mentioned
several times, but the leading 's' is uppercase (Scsi_device)
and should be lowercase (scsi_device).  Fixed by this commit.

Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-12-02 08:43:43 -07:00
Jonathan Neuschäfer
0f83aaa3c0 genericirq.rst: Remove :c:func:... in code blocks
In code blocks, :c:func:`...` annotations don't result in
cross-references. Instead, they are rendered verbatim.  Remove these
broken annotations, and mark function calls with parentheses() again.

Fixes: 76d40fae13 ("genericirq.rst: add cross-reference links and use monospaced fonts")
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-12-02 08:41:46 -07:00