syzbot reports following splat:
BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450
net/bridge/netfilter/ebt_stp.c:162
ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162
xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506
ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline]
ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline]
The uninitialised access is
xt_mtchk_param->nft_compat
... which should be set to 0.
Fix it by zeroing the struct beforehand, same for tgchk.
ip(6)tables targetinfo uses c99-style initialiser, so no change
needed there.
Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com
Fixes: 55917a21d0 ("netfilter: x_tables: add context to know if extension runs from nft_compat")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jozsef Kadlecsik says:
====================
ipset patches for nf
- Check hook mask for unsupported hooks instead of supported ones in xt_set.
(Serhey Popovych).
- List/save just timing out entries with "timeout 1" instead of "timeout 0":
zero timeout value means permanent entries. When restoring the elements,
we'd add non-timing out entries. Fixes netfilter bugzilla id #1258.
- Limit max timeout value to (UINT_MAX >> 1)/MSEC_PER_SEC due to the
negative value condition in msecs_to_jiffies(). msecs_to_jiffies()
should be revised: if one wants to set the timeout above 2147483,
msecs_to_jiffies() sets the value to 4294967. (Reported by Maxim Masiutin).
- Forbid family for hash:mac sets in the kernel module: ipset userspace tool
enforces it but third party tools could create sets with this parameter.
Such sets then cannot be listed/saved with ipset itself. (Florent Fourcot)
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
the ebtables evaluation loop expects targets to return
positive values (jumps), or negative values (absolute verdicts).
This is completely different from what xtables does.
In xtables, targets are expected to return the standard netfilter
verdicts, i.e. NF_DROP, NF_ACCEPT, etc.
ebtables will consider these as jumps.
Therefore reject any target found due to unspec fallback.
v2: also reject watchers. ebtables ignores their return value, so
a target that assumes skb ownership (and returns NF_STOLEN) causes
use-after-free.
The only watchers in the 'ebtables' front-end are log and nflog;
both have AF_BRIDGE specific wrappers on kernel side.
Reported-by: syzbot+2b43f681169a2a0d306a@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Userspace `ipset` command forbids family option for hash:mac type:
ipset create test hash:mac family inet4
ipset v6.30: Unknown argument: `family'
However, this check is not done in kernel itself. When someone use
external netlink applications (pyroute2 python library for example), one
can create hash:mac with invalid family and inconsistant results from
userspace (`ipset` command cannot read set content anymore).
This patch enforce the logic in kernel, and forbids insertion of
hash:mac with a family set.
Since IP_SET_PROTO_UNDEF is defined only for hash:mac, this patch has no
impact on other hash:* sets
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Victorien Molle <victorien.molle@wifirst.fr>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Due to the negative value condition in msecs_to_jiffies(), the real
max possible timeout value must be set to (UINT_MAX >> 1)/MSEC_PER_SEC.
Neutron Soutmun proposed the proper fix, but an insufficient one was
applied, see https://patchwork.ozlabs.org/patch/400405/.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Inserting rule before one with SET target we get error with warning in
dmesg(1) output:
# iptables -A FORWARD -t mangle -j SET --map-set test src --map-prio
# iptables -I FORWARD 1 -t mangle -j ACCEPT
iptables: Invalid argument. Run `dmesg' for more information.
# dmesg |tail -n1
[268578.026643] mapping of prio or/and queue is allowed only from \
OUTPUT/FORWARD/POSTROUTING chains
Rather than checking for supported hook bits for SET target check for
unsupported one as done in all rest of matches and targets.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
On arm64, ebt_entry_{match,watcher,target} structs are 40 bytes long
while on 32-bit arm these structs have a size of 36 bytes.
COMPAT_XT_ALIGN() macro cannot be used here to determine the necessary
padding for the CONFIG_COMPAT because it imposes an 8-byte boundary
alignment, condition that is not found in 32-bit ebtables application.
Signed-off-by: Alin Nastac <alin.nastac@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
There is mistake in the rt_mode_allow_non_local assignment.
It should be used to check if sending to non-local addresses is
allowed, now it checks if local addresses are allowed.
As local addresses are allowed for most of the cases, the only
places that are affected are for traffic to transparent cache
servers:
- bypass connections when cache server is not available
- related ICMP in FORWARD hook when sent to cache server
Fixes: 4a4739d56b ("ipvs: Pull out crosses_local_route_boundary logic")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In order to allocate icmpv6 skb, sizeof(struct ipv6hdr) should be used.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ip_vs_ftp requires conntrack modules for mangling
of FTP command responses in passive mode.
Make sure the conntrack hooks are registered when
real servers use NAT method in FTP virtual service.
The hooks will be registered while the service is
present.
Fixes: 0c66dc1ea3 ("netfilter: conntrack: register hooks in netns when needed by ruleset")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The helper and timeout strings are from user-space, we need to make
sure they are null terminated. If not, evil user could make kernel
read the unexpected memory, even print it when fail to find by the
following codes.
pr_info_ratelimited("No such helper \"%s\"\n", helper_name);
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch reorders the error cases in showing the XPS configuration so
that we hold off on memory allocation until after we have verified that we
can support XPS on a given ring.
Fixes: 184c449f91 ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With CONFIG_CC_STACKPROTECTOR enabled the kernel panics as below when
parsing a NCSI_CMD_PKG_INFO command:
[ 150.149711] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: 805cff08
[ 150.149711]
[ 150.159919] CPU: 0 PID: 1301 Comm: ncsi-netlink Not tainted 4.13.16-468cbec6d2c91239332cb91b1f0a73aafcb6f0c6 #1
[ 150.170004] Hardware name: Generic DT based system
[ 150.174852] [<80109930>] (unwind_backtrace) from [<80106bc4>] (show_stack+0x20/0x24)
[ 150.182641] [<80106bc4>] (show_stack) from [<805d36e4>] (dump_stack+0x20/0x28)
[ 150.189888] [<805d36e4>] (dump_stack) from [<801163ac>] (panic+0xdc/0x278)
[ 150.196780] [<801163ac>] (panic) from [<801162cc>] (__stack_chk_fail+0x20/0x24)
[ 150.204111] [<801162cc>] (__stack_chk_fail) from [<805cff08>] (ncsi_pkg_info_all_nl+0x244/0x258)
[ 150.212912] [<805cff08>] (ncsi_pkg_info_all_nl) from [<804f939c>] (genl_lock_dumpit+0x3c/0x54)
[ 150.221535] [<804f939c>] (genl_lock_dumpit) from [<804f873c>] (netlink_dump+0xf8/0x284)
[ 150.229550] [<804f873c>] (netlink_dump) from [<804f8d44>] (__netlink_dump_start+0x124/0x17c)
[ 150.237992] [<804f8d44>] (__netlink_dump_start) from [<804f9880>] (genl_rcv_msg+0x1c8/0x3d4)
[ 150.246440] [<804f9880>] (genl_rcv_msg) from [<804f9174>] (netlink_rcv_skb+0xd8/0x134)
[ 150.254361] [<804f9174>] (netlink_rcv_skb) from [<804f96a4>] (genl_rcv+0x30/0x44)
[ 150.261850] [<804f96a4>] (genl_rcv) from [<804f7790>] (netlink_unicast+0x198/0x234)
[ 150.269511] [<804f7790>] (netlink_unicast) from [<804f7ffc>] (netlink_sendmsg+0x368/0x3b0)
[ 150.277783] [<804f7ffc>] (netlink_sendmsg) from [<804abea4>] (sock_sendmsg+0x24/0x34)
[ 150.285625] [<804abea4>] (sock_sendmsg) from [<804ac1dc>] (___sys_sendmsg+0x244/0x260)
[ 150.293556] [<804ac1dc>] (___sys_sendmsg) from [<804ad98c>] (__sys_sendmsg+0x5c/0x9c)
[ 150.301400] [<804ad98c>] (__sys_sendmsg) from [<804ad9e4>] (SyS_sendmsg+0x18/0x1c)
[ 150.308984] [<804ad9e4>] (SyS_sendmsg) from [<80102640>] (ret_fast_syscall+0x0/0x3c)
[ 150.316743] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: 805cff08
This turns out to be because the attrs array in ncsi_pkg_info_all_nl()
is initialised to a length of NCSI_ATTR_MAX which is the maximum
attribute number, not the number of attributes.
Fixes: 955dc68cb9 ("net/ncsi: Add generic netlink family")
Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we fail to modify a rule, we incorrectly release the idr handle
of the unmodified old rule.
Fix that by checking if we need to release it.
Fixes: fe2502e49b ("net_sched: remove cls_flower idr on failure")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree:
1) Null pointer dereference when dumping conntrack helper configuration,
from Taehee Yoo.
2) Missing sanitization in ebtables extension name through compat,
from Paolo Abeni.
3) Broken fetch of tracing value, from Taehee Yoo.
4) Incorrect arithmetics in packet ratelimiting.
5) Buffer overflow in IPVS sync daemon, from Julian Anastasov.
6) Wrong argument to nla_strlcpy() in nfnetlink_{acct,cthelper},
from Eric Dumazet.
7) Fix splat in nft_update_chain_stats().
8) Null pointer dereference from object netlink dump path, from
Taehee Yoo.
9) Missing static_branch_inc() when enabling counters in existing
chain, from Taehee Yoo.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following splat.
[118709.054937] BUG: using smp_processor_id() in preemptible [00000000] code: test/1571
[118709.054970] caller is nft_update_chain_stats.isra.4+0x53/0x97 [nf_tables]
[118709.054980] CPU: 2 PID: 1571 Comm: test Not tainted 4.17.0-rc6+ #335
[...]
[118709.054992] Call Trace:
[118709.055011] dump_stack+0x5f/0x86
[118709.055026] check_preemption_disabled+0xd4/0xe4
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull networking fixes from David Miller:
"Let's begin the holiday weekend with some networking fixes:
1) Whoops need to restrict cfg80211 wiphy names even more to 64
bytes. From Eric Biggers.
2) Fix flags being ignored when using kernel_connect() with SCTP,
from Xin Long.
3) Use after free in DCCP, from Alexey Kodanev.
4) Need to check rhltable_init() return value in ipmr code, from Eric
Dumazet.
5) XDP handling fixes in virtio_net from Jason Wang.
6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu.
7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack
Morgenstein.
8) Prevent out-of-bounds speculation using indexes in BPF, from
Daniel Borkmann.
9) Fix regression added by AF_PACKET link layer cure, from Willem de
Bruijn.
10) Correct ENIC dma mask, from Govindarajulu Varadarajan.
11) Missing config options for PMTU tests, from Stefano Brivio"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
ibmvnic: Fix partial success login retries
selftests/net: Add missing config options for PMTU tests
mlx4_core: allocate ICM memory in page size chunks
enic: set DMA mask to 47 bit
ppp: remove the PPPIOCDETACH ioctl
ipv4: remove warning in ip_recv_error
net : sched: cls_api: deal with egdev path only if needed
vhost: synchronize IOTLB message with dev cleanup
packet: fix reserve calculation
net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands
net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
bpf: properly enforce index mask to prevent out-of-bounds speculation
net/mlx4: Fix irq-unsafe spinlock usage
net: phy: broadcom: Fix bcm_write_exp()
net: phy: broadcom: Fix auxiliary control register reads
net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message
ibmvnic: Only do H_EOI for mobility events
tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
virtio-net: fix leaking page for gso packet during mergeable XDP
...
- prevent hardif_put call with NULL parameter, by Colin Ian King
- Avoid race in Translation Table allocator, by Sven Eckelmann
- Fix Translation Table sync flags for intermediate Responses,
by Linus Luessing
- prevent sending inconsistent Translation Table TVLVs,
by Marek Lindner
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlsGpzQWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeocg6D/420GIZ7fJf5xadOzw9XYDbU+tc
nSz/N7V40BGit5i+bTo1fvQ9KRw4amUgx5nrTGi+nFmm0Au3cvwtFQe/c9JWZjFj
/7LWsc/HJPjhS045qwbnwBNBrlEM0KbEwqibmKgfrwVf1uIlUueMuILOFrQtAMhE
WokWK0mffxwPhwFlLxcByzqJSIUPB7BIoYnobU6orqjECdUVpUs3u4AF4Hbk28Y1
P3HS2WCK+Xgf1My7yjahh1pf1gM/AD0y+Xf+1k7dfM+n3qZrhdAPx30JmAIbcNWG
T4uSarxXjio9RRV9Fy1DIggpi1oCCCGAlayTg6RwyYIv5qYB6L7dPrBlRx+gOtox
QpLWIczvdMvloYFV5QykviErZRutFVBF2LKOEOI0GAGU/mCal2GNd+K1V504z+Gq
PM/5SmuAPOmCXe1rf/72zpkFSspZ2h4xvPjYpdNRyBRg/QWHqz+JWMGeENtKoRYV
v+3wYJTEa1bdeoc5NH7Fnc2iudeNlmISDuMxon1ikkgh5jHbaaC8Jvc6inp8y/JW
hhvYd6A406H/N6WOn0FbbSuZc1DQFO/92Fib4N2zdOwBFY2adLTpjAayv/ePpMji
SnGUdxoinWOUWpQo4S4SX4PXdE6F6ylNU4/Z2k9ixyQxNWO3T9wbKPaQGfLl8iSn
P9wPkGr84R2tlXi21Q==
=KY0V
-----END PGP SIGNATURE-----
Merge tag 'batadv-net-for-davem-20180524' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- prevent hardif_put call with NULL parameter, by Colin Ian King
- Avoid race in Translation Table allocator, by Sven Eckelmann
- Fix Translation Table sync flags for intermediate Responses,
by Linus Luessing
- prevent sending inconsistent Translation Table TVLVs,
by Marek Lindner
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A precondition check in ip_recv_error triggered on an otherwise benign
race. Remove the warning.
The warning triggers when passing an ipv6 socket to this ipv4 error
handling function. RaceFuzzer was able to trigger it due to a race
in setsockopt IPV6_ADDRFORM.
---
CPU0
do_ipv6_setsockopt
sk->sk_socket->ops = &inet_dgram_ops;
---
CPU1
sk->sk_prot->recvmsg
udp_recvmsg
ip_recv_error
WARN_ON_ONCE(sk->sk_family == AF_INET6);
---
CPU0
do_ipv6_setsockopt
sk->sk_family = PF_INET;
This socket option converts a v6 socket that is connected to a v4 peer
to an v4 socket. It updates the socket on the fly, changing fields in
sk as well as other structs. This is inherently non-atomic. It races
with the lockless udp_recvmsg path.
No other code makes an assumption that these fields are updated
atomically. It is benign here, too, as ip_recv_error cares only about
the protocol of the skbs enqueued on the error queue, for which
sk_family is not a precise predictor (thanks to another isue with
IPV6_ADDRFORM).
Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr
Fixes: 7ce875e5ec ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error")
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dealing with ingress rule on a netdev, if we did fine through the
conventional path, there's no need to continue into the egdev route,
and we can stop right there.
Not doing so may cause a 2nd rule to be added by the cls api layer
with the ingress being the egdev.
For example, under sriov switchdev scheme, a user rule of VFR A --> VFR B
will end up with two HW rules (1) VF A --> VF B and (2) uplink --> VF B
Fixes: 208c0f4b52 ('net: sched: use tc_setup_cb_call to call per-block callbacks')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b84bbaf7a6 ("packet: in packet_snd start writing at link
layer allocation") ensures that packet_snd always starts writing
the link layer header in reserved headroom allocated for this
purpose.
This is needed because packets may be shorter than hard_header_len,
in which case the space up to hard_header_len may be zeroed. But
that necessary padding is not accounted for in skb->len.
The fix, however, is buggy. It calls skb_push, which grows skb->len
when moving skb->data back. But in this case packet length should not
change.
Instead, call skb_reserve, which moves both skb->data and skb->tail
back, without changing length.
Fixes: b84bbaf7a6 ("packet: in packet_snd start writing at link layer allocation")
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Remove bouncing addresses from the MAINTAINERS file
- Kernel oops and bad error handling fixes for hfi, i40iw, cxgb4, and hns drivers
- Various small LOC behavioral/operational bugs in mlx5, hns, qedr and i40iw drivers
- Two fixes for patches already sent during the merge window
- A long standing bug related to not decreasing the pinned pages count in the right
MM was found and fixed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCgAGBQJbByPQAAoJEDht9xV+IJsa164P/AihB/vbn9MBdK3pe1OSUGTm
tKZJ/Y6nY/Q/XTJSeM2wNECk8fOrZbKuLBz2XlPRsB2djp4ugC5WWfK9YbwWMGXG
I5B/lB8VTorQr8E5i9lqqMDQc8aF8VcGJtdqVE3nD4JsVTrQSGiSnw45/BARDUm3
OycJJMDOWhDj2wnNSa+JfjPemIMDM1jse7DnsJfDsGfTMS/G+6nyzjKIlEnnFZ8/
PBxhq0q7C5viNDwwn2GsAVUrATTlW48SY0WYhkgMdSl20d2th9wMZqNMqtniz8NP
lg87SrhzsAPOTlbSWlYYkAnzE7nEhfJyIfYUp2piNJeYuOohYPtO6w99Tqjl/GmU
uLIYIXtZCxAK1Zb/znc49HkRVL5YFDsQGXdtYy7tvRZPwwR32kowUtpKIWaZFz8O
BA/x+Zgqu9AlwqSWwQwxmMbUX42RRwhNJDVyTYlXQSSzhfgFaLIZARqb4K6HxeNN
vZN0BK+x6pX6FI7hpdsqNRtH1oo4SNUBxiuUsrZ7cy7GqYNdUJ6piygDgmERaJxU
svIUJof/+OoU1QyErQ0JgUEK/3jOHbjxSPb/rjQeqxAnCqhaGOuNGMtdfsGqgvBU
x/u3eDcbfi/LBErXR46gYtxnOQ8I2BB+m8erUc/GVvCzWrX+R7ELZYpBrP5Pcu/6
mr2D7hDqgZHbeU8aB8+D
=uFZh
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"This is pretty much just the usual array of smallish driver bugs.
- remove bouncing addresses from the MAINTAINERS file
- kernel oops and bad error handling fixes for hfi, i40iw, cxgb4, and
hns drivers
- various small LOC behavioral/operational bugs in mlx5, hns, qedr
and i40iw drivers
- two fixes for patches already sent during the merge window
- a long-standing bug related to not decreasing the pinned pages
count in the right MM was found and fixed"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits)
RDMA/hns: Move the location for initializing tmp_len
RDMA/hns: Bugfix for cq record db for kernel
IB/uverbs: Fix uverbs_attr_get_obj
RDMA/qedr: Fix doorbell bar mapping for dpi > 1
IB/umem: Use the correct mm during ib_umem_release
iw_cxgb4: Fix an error handling path in 'c4iw_get_dma_mr()'
RDMA/i40iw: Avoid panic when reading back the IRQ affinity hint
RDMA/i40iw: Avoid reference leaks when processing the AEQ
RDMA/i40iw: Avoid panic when objects are being created and destroyed
RDMA/hns: Fix the bug with NULL pointer
RDMA/hns: Set NULL for __internal_mr
RDMA/hns: Enable inner_pa_vld filed of mpt
RDMA/hns: Set desc_dma_addr for zero when free cmq desc
RDMA/hns: Fix the bug with rq sge
RDMA/hns: Not support qp transition from reset to reset for hip06
RDMA/hns: Add return operation when configured global param fail
RDMA/hns: Update convert function of endian format
RDMA/hns: Load the RoCE dirver automatically
RDMA/hns: Bugfix for rq record db for kernel
RDMA/hns: Add rq inline flags judgement
...
* hwsim radio dump wasn't working for the first radio
* mesh was updating statistics incorrectly
* a netlink message allocation was possibly too short
* wiphy name limit was still too long
* in certain cases regdb query could find a NULL pointer
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlsFNRQACgkQB8qZga/f
l8RTBw//Txtrv6BHZ5VHaibMwFfXB9UIuhzogcuUYzxF1qXzF4l4N2GehUGdlKPy
pjNGwYbqtD1b/mCa/BSGAHcuHXSQNmVRVOv3Vjvb6XtAPTVXiQcYWYqRA+F90IcL
gpfrl1RQKHGoZ8S/DST8YEtzgEog9hRr/WvnOCphVqnDohUM5UIv2iPF8Vp5ylQ8
cIwVa7/QgPJ5vG8EE7aPJTHnga9kNRqvlIAODq8H5QwwFTUPP431AjUHK7/nL/+l
GQF1CXA2mQldPowsNK82vS8guaIykD3wxLeuWBiHCa7EExmX5eA5NlySvvAwd1bE
2fM4/vTA5X0jNzqIYzVqZS4rbEHu7h/Kcm1QWCydl0xeKxONO0nfJj+AdWUBE2YH
g9CHEqnChIJgw43kXbN2E2WflDRL+k4yjQlvhbWIcr9yk/8pdO4mkD6qpwGbBQsn
kyeWbhB58M7IGAkTqrx9FeK7rCnPO5SZGFRD7Rpou0S5ioP7ce/xvkv5gvYN3Knu
OUsQv42mwY23cx2XEyeJ/4pXMaihw1lRyiTHSgpJjha2XdmlvvfPu5/pJcSft4jt
weQTot6ugCGimyBx/5bsJIczuHMVE1pD9ctjty5I0Xxfv/Qj78HSwFlB3RKWsFnG
fzksSC1ik5OSYBOi6vFiRO10EUBlVjNXPhP8x5LnAkBWwHp5juM=
=KWsp
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2018-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A handful of fixes:
* hwsim radio dump wasn't working for the first radio
* mesh was updating statistics incorrectly
* a netlink message allocation was possibly too short
* wiphy name limit was still too long
* in certain cases regdb query could find a NULL pointer
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Credit calculations for the packet ratelimiting are not correct, as per
the applied ratelimit of 25/second and burst 8, a total of 33 packets
should have been accepted. This is true in iptables(33) but not in
nftables (~65). For packet ratelimiting, use:
div_u64(limit->nsecs, limit->rate) * limit->burst;
to calculate credit, just like in iptables' xt_limit does.
Moreover, use default burst in iptables, users are expecting similar
behaviour.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In the nft_meta_set_eval, nftrace value is dereferenced as u32 from sreg.
But correct type is u8. so that sometimes incorrect value is dereferenced.
Steps to reproduce:
%nft add table ip filter
%nft add chain ip filter input { type filter hook input priority 4\; }
%nft add rule ip filter input nftrace set 0
%nft monitor
Sometimes, we can see trace messages.
trace id 16767227 ip filter input packet: iif "enp2s0"
ether saddr xx:xx:xx:xx:xx:xx ether daddr xx:xx:xx:xx:xx:xx
ip saddr 192.168.0.1 ip daddr 255.255.255.255 ip dscp cs0
ip ecn not-ect ip
trace id 16767227 ip filter input rule nftrace set 0 (verdict continue)
trace id 16767227 ip filter input verdict continue
trace id 16767227 ip filter input
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Now sctp uses inet_dgram_connect as its proto_ops .connect, and the flags
param can't be passed into its proto .connect where this flags is really
needed.
sctp works around it by getting flags from socket file in __sctp_connect.
It works for connecting from userspace, as inherently the user sock has
socket file and it passes f_flags as the flags param into the proto_ops
.connect.
However, the sock created by sock_create_kern doesn't have a socket file,
and it passes the flags (like O_NONBLOCK) by using the flags param in
kernel_connect, which calls proto_ops .connect later.
So to fix it, this patch defines a new proto_ops .connect for sctp,
sctp_inet_connect, which calls __sctp_connect() directly with this
flags param. After this, the sctp's proto .connect can be removed.
Note that sctp_inet_connect doesn't need to do some checks that are not
needed for sctp, which makes thing better than with inet_dgram_connect.
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some drivers may call this function when regdb is not initialized yet,
so we need to make sure regdb is valid before trying to access it.
Make sure regdb is initialized before trying to access it in
reg_query_regdb_wmm() and query_regdb().
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently ip6gre and ip6erspan share single metadata mode device,
using 'collect_md_tun'. Thus, when doing:
ip link add dev ip6gre11 type ip6gretap external
ip link add dev ip6erspan12 type ip6erspan external
RTNETLINK answers: File exists
simply fails due to the 2nd tries to create the same collect_md_tun.
The patch fixes it by adding a separate collect md tunnel device
for the ip6erspan, 'collect_md_tun_erspan'. As a result, a couple
of places need to refactor/split up in order to distinguish ip6gre
and ip6erspan.
First, move the collect_md check at ip6gre_tunnel_{unlink,link} and
create separate function {ip6gre,ip6ersapn}_tunnel_{link_md,unlink_md}.
Then before link/unlink, make sure the link_md/unlink_md is called.
Finally, a separate ndo_uninit is created for ip6erspan. Tested it
using the samples/bpf/test_tunnel_bpf.sh.
Fixes: ef7baf5e08 ("ip6_gre: add ip6 erspan collect_md mode")
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even if commit 1d27732f41 ("net: dsa: setup and teardown ports") indicated
that registering a devlink instance for unused ports is not a problem, and this
is true, this can be confusing nonetheless, so let's not do it.
Fixes: 1d27732f41 ("net: dsa: setup and teardown ports")
Reported-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While removing queues from the XPS map, the individual CPU ID
alone was used to index the CPUs map, this should be changed to also
factor in the traffic class mapping for the CPU-to-queue lookup.
Fixes: 184c449f91 ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocation size of nlmsg in cfg80211_ft_event is based on ric_ies_len
and doesn't take into account ies_len. This leads to
NL80211_CMD_FT_EVENT message construction failure in case ft_event
contains large enough ies buffer.
Add ies_len to the nlmsg allocation size.
Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Daniel Borkmann says:
====================
pull-request: bpf 2018-05-18
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix two bugs in sockmap, a use after free in sockmap's error path
from sock_map_ctx_update_elem() where we mistakenly drop a reference
we didn't take prior to that, and in the same function fix a race
in bpf_prog_inc_not_zero() where we didn't use the progs from prior
READ_ONCE(), from John.
2) Reject program expansions once we figure out that their jump target
which crosses patchlet boundaries could otherwise get truncated in
insn->off space, from Daniel.
3) Check the return value of fopen() in BPF selftest's test_verifier
where we determine whether unpriv BPF is disabled, and iff we do
fail there then just assume it is disabled. This fixes a segfault
when used with older kernels, from Jesper.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently during testing, I ran into the following panic:
[ 207.892422] Internal error: Accessing user space memory outside uaccess.h routines: 96000004 [#1] SMP
[ 207.901637] Modules linked in: binfmt_misc [...]
[ 207.966530] CPU: 45 PID: 2256 Comm: test_verifier Tainted: G W 4.17.0-rc3+ #7
[ 207.974956] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
[ 207.982428] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 207.987214] pc : bpf_skb_load_helper_8_no_cache+0x34/0xc0
[ 207.992603] lr : 0xffff000000bdb754
[ 207.996080] sp : ffff000013703ca0
[ 207.999384] x29: ffff000013703ca0 x28: 0000000000000001
[ 208.004688] x27: 0000000000000001 x26: 0000000000000000
[ 208.009992] x25: ffff000013703ce0 x24: ffff800fb4afcb00
[ 208.015295] x23: ffff00007d2f5038 x22: ffff00007d2f5000
[ 208.020599] x21: fffffffffeff2a6f x20: 000000000000000a
[ 208.025903] x19: ffff000009578000 x18: 0000000000000a03
[ 208.031206] x17: 0000000000000000 x16: 0000000000000000
[ 208.036510] x15: 0000ffff9de83000 x14: 0000000000000000
[ 208.041813] x13: 0000000000000000 x12: 0000000000000000
[ 208.047116] x11: 0000000000000001 x10: ffff0000089e7f18
[ 208.052419] x9 : fffffffffeff2a6f x8 : 0000000000000000
[ 208.057723] x7 : 000000000000000a x6 : 00280c6160000000
[ 208.063026] x5 : 0000000000000018 x4 : 0000000000007db6
[ 208.068329] x3 : 000000000008647a x2 : 19868179b1484500
[ 208.073632] x1 : 0000000000000000 x0 : ffff000009578c08
[ 208.078938] Process test_verifier (pid: 2256, stack limit = 0x0000000049ca7974)
[ 208.086235] Call trace:
[ 208.088672] bpf_skb_load_helper_8_no_cache+0x34/0xc0
[ 208.093713] 0xffff000000bdb754
[ 208.096845] bpf_test_run+0x78/0xf8
[ 208.100324] bpf_prog_test_run_skb+0x148/0x230
[ 208.104758] sys_bpf+0x314/0x1198
[ 208.108064] el0_svc_naked+0x30/0x34
[ 208.111632] Code: 91302260 f9400001 f9001fa1 d2800001 (29500680)
[ 208.117717] ---[ end trace 263cb8a59b5bf29f ]---
The program itself which caused this had a long jump over the whole
instruction sequence where all of the inner instructions required
heavy expansions into multiple BPF instructions. Additionally, I also
had BPF hardening enabled which requires once more rewrites of all
constant values in order to blind them. Each time we rewrite insns,
bpf_adj_branches() would need to potentially adjust branch targets
which cross the patchlet boundary to accommodate for the additional
delta. Eventually that lead to the case where the target offset could
not fit into insn->off's upper 0x7fff limit anymore where then offset
wraps around becoming negative (in s16 universe), or vice versa
depending on the jump direction.
Therefore it becomes necessary to detect and reject any such occasions
in a generic way for native eBPF and cBPF to eBPF migrations. For
the latter we can simply check bounds in the bpf_convert_filter()'s
BPF_EMIT_JMP helper macro and bail out once we surpass limits. The
bpf_patch_insn_single() for native eBPF (and cBPF to eBPF in case
of subsequent hardening) is a bit more complex in that we need to
detect such truncations before hitting the bpf_prog_realloc(). Thus
the latter is split into an extra pass to probe problematic offsets
on the original program in order to fail early. With that in place
and carefully tested I no longer hit the panic and the rewrites are
rejected properly. The above example panic I've seen on bpf-next,
though the issue itself is generic in that a guard against this issue
in bpf seems more appropriate in this case.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Device features may change during transmission. In particular with
corking, a device may toggle scatter-gather in between allocating
and writing to an skb.
Do not unconditionally assume that !NETIF_F_SG at write time implies
that the same held at alloc time and thus the skb has sufficient
tailroom.
This issue predates git history.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
overwrites these settings with GRE-specific ones.
Similarly for changelink callbacks, which are handled by
ip6gre_changelink() calls ip6gre_tnl_change() calls
ip6gre_tnl_link_config() as well.
The difference ends up being 12 vs. 20 bytes, and this is generally not
a problem, because a 12-byte request likely ends up allocating more and
the extra 8 bytes are thus available. However correct it is not.
So replace the newlink and changelink callbacks with an ERSPAN-specific
ones, reusing the newly-introduced _common() functions.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extract from ip6gre_changelink() a reusable function
ip6gre_changelink_common(). This will allow introduction of
ERSPAN-specific _changelink() function with not a lot of code
duplication.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extract from ip6gre_newlink() a reusable function
ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
made customizable for ERSPAN, thus reorder it with calls to
ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
caller, ip6gre_newlink(). Thus enable an ERSPAN-specific _newlink()
function without a lot of duplicity.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>