linux-hardened

Author	SHA1	Message	Date
Jakub Pawlak	fdd4044ffd	iavf: Remove timer for work triggering, use delaying work instead Remove the watchdog timer, instead declare watchdog task as delayed work and use dedicated workqueue to service driver tasks. The dedicated driver workqueue iavf_wq is common for all driver instances. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Jakub Pawlak	b476b0030e	iavf: Move commands processing to the separate function Move the commands processing outside the watchdog_task() function. This reduce length and complexity of the function which is mainly designed to process the watchdog state machine. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Avinash Dayanand	16e00c25ac	iavf: Fix the math for valid length for ADq enable There was a calculation error in virtchnl regarding the valid length which was fixed recently and a corresponding change needs to go into the code while we enable ADq. Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Aleksandr Loktionov	f0a48fb441	iavf: Change GFP_KERNEL to GFP_ATOMIC in kzalloc() iavf_add_vlan() is being called in atomic context so kzalloc() needs GFP_ATOMIC. This patch fixes it. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Mitch Williams	88ec7308ea	iavf: wait longer for close to complete On some hardware/driver/architecture combinations, it may take longer than 200msec for all close operations to be completed, causing a spurious error message to be logged. Increase the timeout value to 500msec to avoid this erroneous error. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Mitch Williams	168d91cf2a	iavf: use signed variable The counter variable in iavf_clean_tx_irq starts out negative and climbs to 0. So allocating it as u16 is actually a really bad idea that just happens to work because the value underflows and overflows consistently on most architectures. Replace the u16 with an int so signed math works as expected. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Akeem G Abodunrin	c2417a7b0e	iavf: Create VLAN tag elements starting from the first element This patch changes how VLAN tag are being populated and programmed into the HW - Instead of start adding VF VLAN tag from the last member of the element list, start from the first member of the list, until number of allowed VLAN tags is exhausted in the HW. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Daniel T. Lee	4d18f6de6a	samples: bpf: refactor header include path Currently, header inclusion in each file is inconsistent. For example, "libbpf.h" header is included as multiple ways. #include "bpf/libbpf.h" #include "libbpf.h" Due to commit `b552d33c80` ("samples/bpf: fix include path in Makefile"), $(srctree)/tools/lib/bpf/ path had been included during build, path "bpf/" in header isn't necessary anymore. This commit removes path "bpf/" in header inclusion. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:28:36 +02:00
Daniel T. Lee	fa206dccd8	samples: bpf: remove unnecessary include options in Makefile Due to recent change of include path at commit `b552d33c80` ("samples/bpf: fix include path in Makefile"), some of the previous include options became unnecessary. This commit removes duplicated include options in Makefile. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:28:36 +02:00
Daniel Borkmann	32b88d3743	Merge branch 'bpf-libbpf-btf-defined-maps' Andrii Nakryiko says: ==================== This patch set implements initial version (as discussed at LSF/MM2019 conference) of a new way to specify BPF maps, relying on BTF type information, which allows for easy extensibility, preserving forward and backward compatibility. See details and examples in description for patch #6. [0] contains an outline of follow up extensions to be added after this basic set of features lands. They are useful by itself, but also allows to bring libbpf to feature-parity with iproute2 BPF loader. That should open a path forward for BPF loaders unification. Patch #1 centralizes commonly used min/max macro in libbpf_internal.h. Patch #2 extracts .BTF and .BTF.ext loading loging from elf_collect(). Patch #3 simplifies elf_collect() error-handling logic. Patch #4 refactors map initialization logic into user-provided maps and global data maps, in preparation to adding another way (BTF-defined maps). Patch #5 adds support for map definitions in multiple ELF sections and deprecates bpf_object__find_map_by_offset() API which doesn't appear to be used anymore and makes assumption that all map definitions reside in single ELF section. Patch #6 splits BTF intialization from sanitization/loading into kernel to preserve original BTF at the time of map initialization. Patch #7 adds support for BTF-defined maps. Patch #8 adds new test for BTF-defined map definition. Patches #9-11 convert test BPF map definitions to use BTF way. [0] https://lore.kernel.org/bpf/CAEf4BzbfdG2ub7gCi0OYqBrUoChVHWsmOntWAkJt47=FE+km+A@mail.gmail.com/ v1->v2: - more BTF-sanity checks in parsing map definitions (Song); - removed confusing usage of "attribute", switched to "field; - split off elf_collect() refactor from btf loading refactor (Song); - split selftests conversion into 3 patches (Stanislav): 1. test already relying on BTF; 2. tests w/ custom types as key/value (so benefiting from BTF); 3. all the rest tests (integers as key/value, special maps w/o BTF support). - smaller code improvements (Song); rfc->v1: - error out on unknown field by default (Stanislav, Jakub, Lorenz); ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:11:40 +02:00
Andrii Nakryiko	df0b779259	selftests/bpf: convert tests w/ custom values to BTF-defined maps Convert a bulk of selftests that have maps with custom (not integer) key and/or value. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:43 +02:00
Andrii Nakryiko	f654407481	selftests/bpf: switch BPF_ANNOTATE_KV_PAIR tests to BTF-defined maps Switch tests that already rely on BTF to BTF-defined map definitions. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:43 +02:00
Andrii Nakryiko	9e3d709c47	selftests/bpf: add test for BTF-defined maps Add file test for BTF-defined map definition. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:42 +02:00
Andrii Nakryiko	abd29c9314	libbpf: allow specifying map definitions using BTF This patch adds support for a new way to define BPF maps. It relies on BTF to describe mandatory and optional attributes of a map, as well as captures type information of key and value naturally. This eliminates the need for BPF_ANNOTATE_KV_PAIR hack and ensures key/value sizes are always in sync with the key/value type. Relying on BTF, this approach allows for both forward and backward compatibility w.r.t. extending supported map definition features. By default, any unrecognized attributes are treated as an error, but it's possible relax this using MAPS_RELAX_COMPAT flag. New attributes, added in the future will need to be optional. The outline of the new map definition (short, BTF-defined maps) is as follows: 1. All the maps should be defined in .maps ELF section. It's possible to have both "legacy" map definitions in `maps` sections and BTF-defined maps in .maps sections. Everything will still work transparently. 2. The map declaration and initialization is done through a global/static variable of a struct type with few mandatory and extra optional fields: - type field is mandatory and specified type of BPF map; - key/value fields are mandatory and capture key/value type/size information; - max_entries attribute is optional; if max_entries is not specified or initialized, it has to be provided in runtime through libbpf API before loading bpf_object; - map_flags is optional and if not defined, will be assumed to be 0. 3. Key/value fields should be a pointer to a type describing key/value. The pointee type is assumed (and will be recorded as such and used for size determination) to be a type describing key/value of the map. This is done to save excessive amounts of space allocated in corresponding ELF sections for key/value of big size. 4. As some maps disallow having BTF type ID associated with key/value, it's possible to specify key/value size explicitly without associating BTF type ID with it. Use key_size and value_size fields to do that (see example below). Here's an example of simple ARRAY map defintion: struct my_value { int x, y, z; }; struct { int type; int max_entries; int key; struct my_value value; } btf_map SEC(".maps") = { .type = BPF_MAP_TYPE_ARRAY, .max_entries = 16, }; This will define BPF ARRAY map 'btf_map' with 16 elements. The key will be of type int and thus key size will be 4 bytes. The value is struct my_value of size 12 bytes. This map can be used from C code exactly the same as with existing maps defined through struct bpf_map_def. Here's an example of STACKMAP definition (which currently disallows BTF type IDs for key/value): struct { __u32 type; __u32 max_entries; __u32 map_flags; __u32 key_size; __u32 value_size; } stackmap SEC(".maps") = { .type = BPF_MAP_TYPE_STACK_TRACE, .max_entries = 128, .map_flags = BPF_F_STACK_BUILD_ID, .key_size = sizeof(__u32), .value_size = PERF_MAX_STACK_DEPTH * sizeof(struct bpf_stack_build_id), }; This approach is naturally extended to support map-in-map, by making a value field to be another struct that describes inner map. This feature is not implemented yet. It's also possible to incrementally add features like pinning with full backwards and forward compatibility. Support for static initialization of BPF_MAP_TYPE_PROG_ARRAY using pointers to BPF programs is also on the roadmap. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:41 +02:00
Andrii Nakryiko	063183bf04	libbpf: split initialization and loading of BTF Libbpf does sanitization of BTF before loading it into kernel, if kernel doesn't support some of newer BTF features. This removes some of the important information from BTF (e.g., DATASEC and VAR description), which will be used for map construction. This patch splits BTF processing into initialization step, in which BTF is initialized from ELF and all the original data is still preserved; and sanitization/loading step, which ensures that BTF is safe to load into kernel. This allows to use full BTF information to construct maps, while still loading valid BTF into older kernels. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:41 +02:00
Andrii Nakryiko	db48814bd2	libbpf: identify maps by section index in addition to offset To support maps to be defined in multiple sections, it's important to identify map not just by offset within its section, but section index as well. This patch adds tracking of section index. For global data, we record section index of corresponding .data/.bss/.rodata ELF section for uniformity, and thus don't need a special value of offset for those maps. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:40 +02:00
Andrii Nakryiko	bf82927125	libbpf: refactor map initialization User and global data maps initialization has gotten pretty complicated and unnecessarily convoluted. This patch splits out the logic for global data map and user-defined map initialization. It also removes the restriction of pre-calculating how many maps will be initialized, instead allowing to keep adding new maps as they are discovered, which will be used later for BTF-defined map definitions. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:39 +02:00
Andrii Nakryiko	01b29d1dc9	libbpf: streamline ELF parsing error-handling Simplify ELF parsing logic by exiting early, as there is no common clean up path to execute. That makes it unnecessary to track when err was set and when it was cleared. It also reduces nesting in some places. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:39 +02:00
Andrii Nakryiko	9c6660d040	libbpf: extract BTF loading logic As a preparation for adding BTF-based BPF map loading, extract .BTF and .BTF.ext loading logic. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:10:12 +02:00
Andrii Nakryiko	d7fe74f940	libbpf: add common min/max macro to libbpf_internal.h Multiple files in libbpf redefine their own definitions for min/max. Let's define them in libbpf_internal.h and use those everywhere. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-18 00:08:54 +02:00
Christian Brauner	d728cf7916	fs/namespace: fix unprivileged mount propagation When propagating mounts across mount namespaces owned by different user namespaces it is not possible anymore to move or umount the mount in the less privileged mount namespace. Here is a reproducer: sudo mount -t tmpfs tmpfs /mnt sudo --make-rshared /mnt # create unprivileged user + mount namespace and preserve propagation unshare -U -m --map-root --propagation=unchanged # now change back to the original mount namespace in another terminal: sudo mkdir /mnt/aaa sudo mount -t tmpfs tmpfs /mnt/aaa # now in the unprivileged user + mount namespace mount --move /mnt/aaa /opt Unfortunately, this is a pretty big deal for userspace since this is e.g. used to inject mounts into running unprivileged containers. So this regression really needs to go away rather quickly. The problem is that a recent change falsely locked the root of the newly added mounts by setting MNT_LOCKED. Fix this by only locking the mounts on copy_mnt_ns() and not when adding a new mount. Fixes: `3bd045cc9c` ("separate copying and locking mount tree on cross-userns copies") Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Tested-by: Christian Brauner <christian@brauner.io> Acked-by: Christian Brauner <christian@brauner.io> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-06-17 17:36:09 -04:00
Eric Biggers	1b0b9cc8d3	vfs: fsmount: add missing mntget() sys_fsmount() needs to take a reference to the new mount when adding it to the anonymous mount namespace. Otherwise the filesystem can be unmounted while it's still in use, as found by syzkaller. Reported-by: Mark Rutland <mark.rutland@arm.com> Reported-by: syzbot+99de05d099a170867f22@syzkaller.appspotmail.com Reported-by: syzbot+7008b8b8ba7df475fdc8@syzkaller.appspotmail.com Fixes: `93766fbd26` ("vfs: syscall: Add fsmount() to create a mount for a superblock") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-06-17 17:36:07 -04:00
Ronnie Sahlberg	61cabc7b0a	cifs: fix GlobalMid_Lock bug in cifs_reconnect We can not hold the GlobalMid_Lock spinlock during the dfs processing in cifs_reconnect since it invokes things that may sleep and thus trigger : BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:23 Thus we need to drop the spinlock during this code block. RHBZ: 1716743 Cc: stable@vger.kernel.org Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-06-17 16:27:02 -05:00
Steve French	8d526d62db	SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write Some servers such as Windows 10 will return STATUS_INSUFFICIENT_RESOURCES as the number of simultaneous SMB3 requests grows (even though the client has sufficient credits). Return EAGAIN on STATUS_INSUFFICIENT_RESOURCES so that we can retry writes which fail with this status code. This (for example) fixes large file copies to Windows 10 on fast networks. Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2019-06-17 16:17:56 -05:00
Jiri Pirko	f517f2716c	net: sched: cls_matchall: allow to delete filter Currently user is unable to delete the filter. See following example: $ tc filter add dev ens16np1 ingress pref 1 handle 1 matchall action drop $ tc filter show dev ens16np1 ingress filter protocol all pref 1 matchall chain 0 filter protocol all pref 1 matchall chain 0 handle 0x1 in_hw action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 $ tc filter del dev ens16np1 ingress pref 1 handle 1 matchall action drop RTNETLINK answers: Operation not supported Implement tcf_proto_ops->delete() op and allow user to delete the filter. Reported-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:05:32 -07:00
Colin Ian King	ad9bf54519	net: hns3: fix dereference of ae_dev before it is null checked Pointer ae_dev is null checked however, prior to that it is dereferenced when assigned pointer ops. Fix this by assigning pointer ops after ae_dev has been null checked. Addresses-Coverity: ("Dereference before null check") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:02:57 -07:00
David S. Miller	43321251e2	Merge branch 'net-sched-act_ctinfo-fixes' Kevin Darbyshire-Bryant says: ==================== net: sched: act_ctinfo: fixes This is first attempt at sending a small series. Order is important because one bug (policy validation) prevents us from encountering the more important 'OOPS' generating bug in action creation. Fix the OOPS first. Confession time: Until very recently, development of this module has been done on 'net-next' tree to 'clean compile' level with run-time testing on backports to 4.14 & 4.19 kernels under openwrt. It turns out that sched: action: based code has been under more active change than I realised. During the back & forward porting during development & testing, the critical ACT_P_CREATED return code got missed despite being in the 4.14 & 4.19 backports. I have now gone through the init functions, using act_csum as reference with a fine toothed comb and am happy they do the same things. This issue hadn't been caught till now due to another issue caused by new strict nla_parse_nested function failing parsing validation before action creation. Thanks to Marcelo Leitner <marcelo.leitner@gmail.com> for flagging extack deficiency (fixed in `733f0766c3` sched: act_ctinfo: use extack error reporting) which led to `b424e432e7` ("netlink: add validation of NLA_F_NESTED flag") and `8cb081746c` ("netlink: make validation more configurable for future strictness”) which led to the policy validation fix, which then led to the action creation fix both contained in this series. If I ever get to a developer conference please feel free to tar/feather/apply cone of shame. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:00:30 -07:00
Kevin Darbyshire-Bryant	c197d63627	net: sched: act_ctinfo: fix policy validation Fix nla_policy definition by specifying an exact length type attribute to CTINFO action paraneter block structure. Without this change, netlink parsing will fail validation and the action will not be instantiated. `8cb081746c` ("netlink: make validation more configurable for future") introduced much stricter checking to attributes being passed via netlink. Existing actions were updated to use less restrictive deprecated versions of nla_parse_nested. As a new module, act_ctinfo should be designed to use the strict checking model otherwise, well, what was the point of implementing it. Confession time: Until very recently, development of this module has been done on 'net-next' tree to 'clean compile' level with run-time testing on backports to 4.14 & 4.19 kernels under openwrt. This is how I managed to miss the run-time impacts of the new strict nla_parse_nested function. I hopefully have learned something from this (glances toward laptop running a net-next kernel) There is however a still outstanding implication on iproute2 user space in that it needs to be told to pass nested netlink messages with the nested attribute actually set. So even with this kernel fix to do things correctly you still cannot instantiate a new 'strict' nla_parse_nested based action such as act_ctinfo with iproute2's tc. Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:00:30 -07:00
Kevin Darbyshire-Bryant	a658c2e49f	net: sched: act_ctinfo: fix action creation Use correct return value on action creation: ACT_P_CREATED. The use of incorrect return value could result in a situation where the system thought a ctinfo module was listening but actually wasn't instantiated correctly leading to an OOPS in tcf_generic_walker(). Confession time: Until very recently, development of this module has been done on 'net-next' tree to 'clean compile' level with run-time testing on backports to 4.14 & 4.19 kernels under openwrt. During the back & forward porting during development & testing, the critical ACT_P_CREATED return code got missed despite being in the 4.14 & 4.19 backports. I have now gone through the init functions, using act_csum as reference with a fine toothed comb. Bonus, no more OOPSes. I managed to also miss this issue till now due to the new strict nla_parse_nested function failing validation before action creation. As an inexperienced developer I've learned that copy/pasting/backporting/forward porting code correctly is hard. If I ever get to a developer conference I shall don the cone of shame. Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:00:30 -07:00
Greg Kroah-Hartman	9b9410766f	Merge branch 'erofs_fix' into staging-linus A branch is needed here to get the fix into staging-next as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-17 22:59:28 +02:00
Gao Xiang	5efe5137f0	staging: erofs: add requirements field in superblock There are some backward incompatible features pending for months, mainly due to on-disk format expensions. However, we should ensure that it cannot be mounted with old kernels. Otherwise, it will causes unexpected behaviors. Fixes: `ba2b77a820` ("staging: erofs: add super block operations") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-17 22:58:58 +02:00
Jason Wang	098eadce3c	vhost_net: disable zerocopy by default Vhost_net was known to suffer from HOL[1] issues which is not easy to fix. Several downstream disable the feature by default. What's more, the datapath was split and datacopy path got the support of batching and XDP support recently which makes it faster than zerocopy part for small packets transmission. It looks to me that disable zerocopy by default is more appropriate. It cold be enabled by default again in the future if we fix the above issues. [1] https://patchwork.kernel.org/patch/3787671/ Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 13:58:02 -07:00
Ard Biesheuvel	c681edae33	net: ipv4: move tcp_fastopen server side code to SipHash library Using a bare block cipher in non-crypto code is almost always a bad idea, not only for security reasons (and we've seen some examples of this in the kernel in the past), but also for performance reasons. In the TCP fastopen case, we call into the bare AES block cipher one or two times (depending on whether the connection is IPv4 or IPv6). On most systems, this results in a call chain such as crypto_cipher_encrypt_one(ctx, dst, src) crypto_cipher_crt(tfm)->cit_encrypt_one(crypto_cipher_tfm(tfm), ...); aesni_encrypt kernel_fpu_begin(); aesni_enc(ctx, dst, src); // asm routine kernel_fpu_end(); It is highly unlikely that the use of special AES instructions has a benefit in this case, especially since we are doing the above twice for IPv6 connections, instead of using a transform which can process the entire input in one go. We could switch to the cbcmac(aes) shash, which would at least get rid of the duplicated overhead in some cases (i.e., today, only arm64 has an accelerated implementation of cbcmac(aes), while x86 will end up using the generic cbcmac template wrapping the AES-NI cipher, which basically ends up doing exactly the above). However, in the given context, it makes more sense to use a light-weight MAC algorithm that is more suitable for the purpose at hand, such as SipHash. Since the output size of SipHash already matches our chosen value for TCP_FASTOPEN_COOKIE_SIZE, and given that it accepts arbitrary input sizes, this greatly simplifies the code as well. NOTE: Server farms backing a single server IP for load balancing purposes and sharing a single fastopen key will be adversely affected by this change unless all systems in the pool receive their kernel upgrades at the same time. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 13:56:26 -07:00
Greg Kroah-Hartman	d7a5417b89	Second set of IIO fixes for the 5.2 cycle. * ad7150 - sense of bit for controlling adaptive vs fixed threshold was flipped. * adt7316 - Fix a build issue due to wrong headers for gpio usage. * lsm6dsx - correctly suspend / resume i2c slaves when the host goes to sleep. * mlx90632 - relax a compatability check to allow for newer devices. Also one counters fix * counter/ftm-quaddec - missing dependencies in Kconfig. -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAl0H5PURHGppYzIzQGtl cm5lbC5vcmcACgkQVIU0mcT0Foh7bhAAqF/VG9KhIioY4ov5U+Exh5OGzirhTLKC TyuK4fnSHO9avql1lTAc9b/tt2tBdE4f0vel6CSP+1GlbwfG01AYUu+ZW7j/GkgT D09o4IH1TspqH5nIc2JunZwYqPJG2v7Fis9IMtS13eTJ+csf9EhVCuIf8tnaPwyM /OH7t3JdSqxtsxHECEsZhl5IDrqZKyQI3+7MtlpcnuQpNjrrwxey4B54csZamUH8 nhefM7CtI+roaT+Ydp5oNpvpVXhNH54rVKwEkJfeuhiuviHoO7SgUF59u4vp9JYa 5dVo+T8+b2gsAt1w4jDCNDALTrFYL1d4XCyHuW+ZvpTGfb99E0g2QRStuBJp0Nyn 9s9K05KTdcHi3HBUxx+pKh7MDUBzKSm5T5euq4ZHRUKF3vjfBhsoN4iVh31IHa53 Nv28vnnxUd7RXHZonoi9wIT0zoDSY1IvpNGtwJaOjBJNcUjtDQivV2KqqvvvSfj0 g7KswaUB5icRR/mtL4Jj2j0voD2e+mMXgknRmHJthXKBKzs4lQHFPqI1rZQy8DRg 2YCi5zJNPBmf7hNbMqyt5YEkflZST3zzOUBRxdju4w3lwKZl6+bQd5B82ET6DZSr n99WXsGcdck/JEWomUhGb5uTbS2LAgsVLIXJTDzgzwQqE4Vp7FFYIHDNeTJx43Cm bEvNn0cT6a8= =2i3G -----END PGP SIGNATURE----- Merge tag 'iio-fixes-for-5.2b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus Jonathan writes: Second set of IIO fixes for the 5.2 cycle. * ad7150 - sense of bit for controlling adaptive vs fixed threshold was flipped. * adt7316 - Fix a build issue due to wrong headers for gpio usage. * lsm6dsx - correctly suspend / resume i2c slaves when the host goes to sleep. * mlx90632 - relax a compatability check to allow for newer devices. Also one counters fix * counter/ftm-quaddec - missing dependencies in Kconfig. * tag 'iio-fixes-for-5.2b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: counter/ftm-quaddec: Add missing dependencies in Kconfig staging: iio: adt7316: Fix build errors when GPIOLIB is not set iio: temperature: mlx90632 Relax the compatibility check iio: imu: st_lsm6dsx: fix PM support for st_lsm6dsx i2c controller staging:iio:ad7150: fix threshold mode config bit	2019-06-17 22:28:29 +02:00
Tuong Lien	6a6b5c8bff	tipc: include retrans failure detection for unicast In patch series, commit `9195948fbf` ("tipc: improve TIPC throughput by Gap ACK blocks"), as for simplicity, the repeated retransmit failures' detection in the function - "tipc_link_retrans()" was kept there for broadcast retransmissions only. This commit now reapplies this feature for link unicast retransmissions that has been done via the function - "tipc_link_advance_transmq()". Also, the "tipc_link_retrans()" is renamed to "tipc_link_bc_retrans()" as it is used only for broadcast. Acked-by: Jon Maloy <jon.maloy@ericsson.se> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 13:27:32 -07:00
Hangbin Liu	9ed68ca0d9	team: add ethtool get_link_ksettings Like bond, add ethtool get_link_ksettings to show the total speed. v2: no update, just repost. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 13:23:12 -07:00
David S. Miller	4fddbf8a99	Merge branch 'tcp-fixes' Eric Dumazet says: ==================== tcp: make sack processing more robust Jonathan Looney brought to our attention multiple problems in TCP stack at the sender side. SACK processing can be abused by malicious peers to either cause overflows, or increase of memory usage. First two patches fix the immediate problems. Since the malicious peers abuse senders by advertizing a very small MSS in their SYN or SYNACK packet, the last two patches add a new sysctl so that admins can chose a higher limit for MSS clamping. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 10:39:56 -07:00
Anisse Astier	adeaa21a4b	arm64: ssbd: explicitly depend on <linux/prctl.h> Fix ssbd.c which depends implicitly on asm/ptrace.h including linux/prctl.h (through for example linux/compat.h, then linux/time.h, linux/seqlock.h, linux/spinlock.h and linux/irqflags.h), and uses PR_SPEC* defines. This is an issue since we'll soon be removing the include from asm/ptrace.h. Fixes: `9cdc0108ba` ("arm64: ssbd: Add prctl interface for per-thread mitigation") Cc: stable@vger.kernel.org Signed-off-by: Anisse Astier <aastier@freebox.fr> Signed-off-by: Will Deacon <will.deacon@arm.com>	2019-06-17 18:38:10 +01:00
Linus Torvalds	eb7c825bf7	RISC-V patches for v5.2-rc6 This tag contains fixes, defconfig, and DT data changes for the v5.2-rc series. The fixes are relatively straightforward: - Addition of a TLB fence in the vmalloc_fault path, so the CPU doesn't enter an infinite page fault loop; - Readdition of the pm_power_off export, so device drivers that reassign it can now be built as modules; - A udelay() fix for RV32, fixing a miscomputation of the delay time; - Removal of deprecated smp_mb__() barriers. The tag also adds initial DT data infrastructure for arch/riscv, along with initial data for the SiFive FU540-C000 SoC and the corresponding HiFive Unleashed board. We also update the RV64 defconfig to include some core drivers for the FU540 in the build. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl0HtEkACgkQx4+xDQu9 KkuRIw//f2vSrUyMh44sevr6euVD0K++hQ0AbteQ94cGHqYWWaNxfwMHFD91Gxbj wowTwgssq7H9nePsKANjiiLULnZNIkWXAlIncjzv3aXkH6JG3f9nEGR49yzvCbIZ yN8wgElJ8rcVWLd096E53Su84CzxuJJ2o3wOI1nQi8aI4h3LwkM2b/O4GxZFpnWb vIhWXqjvbUb8XL7Y+VPewtxnZItOUDHkuIkup4kP2bTgl2iDW93hzWwxNKbt6v+m 9wTzAChjcepCAXSmEGeeZ/h2HNqw2crs+NWOe0drcKxL2vKPZ6gS8ZRX/NuIoDr4 JgMILzYSO28z8N6w1cJJUdN4eGhCTvdxVTQXvkk/yZoT08X6M0xb5A1MbtizgOJ6 mZK/vM9gtuoUSZG0SRNeNoqHbWu1tIm29z435Be8hWAtzXlEfewJm8ntgFO4dGmb E8TRSgjLzdHY0Nvwx/KVtvYmE/TMybVVRsxJJ525dqJlHT7f3VuRstvw7VQJQpz2 +JfsZbYk1KjbUc25QpAqF1LUxrRQFn2JL0Cqw+L49J8eshY77rsTcAKP6ZZWiSFZ qodU0oPF4BkS1t0bnFuNwlqsAr/q9EiAnQO7+SvqQY/ZUnMNk9gCNn5k/rHMCfyD 2Dyo6iAbj+Yyb1rrQxX6QnlbHgpFxsG3N4s9E5jOPgKyEQM4JQ4= =aotJ -----END PGP SIGNATURE----- Merge tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "This contains fixes, defconfig, and DT data changes for the v5.2-rc series. The fixes are relatively straightforward: - Addition of a TLB fence in the vmalloc_fault path, so the CPU doesn't enter an infinite page fault loop - Readdition of the pm_power_off export, so device drivers that reassign it can now be built as modules - A udelay() fix for RV32, fixing a miscomputation of the delay time - Removal of deprecated smp_mb__() barriers This also adds initial DT data infrastructure for arch/riscv, along with initial data for the SiFive FU540-C000 SoC and the corresponding HiFive Unleashed board. We also update the RV64 defconfig to include some core drivers for the FU540 in the build" * tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: remove unused barrier defines riscv: mm: synchronize MMU after pte change riscv: dts: add initial board data for the SiFive HiFive Unleashed riscv: dts: add initial support for the SiFive FU540-C000 SoC dt-bindings: riscv: convert cpu binding to json-schema dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540 arch: riscv: add support for building DTB files from DT source data riscv: Fix udelay in RV32. riscv: export pm_power_off again RISC-V: defconfig: enable clocks, serial console	2019-06-17 10:34:03 -07:00
Christoph Hellwig	4569180495	block: fix page leak when merging to same page When multiple iovecs reference the same page, each get_user_page call will add a reference to the page. But once we've created the bio that information gets lost and only a single reference will be dropped after I/O completion. Use the same_page information returned from __bio_try_merge_page to drop additional references to pages that were already present in the bio. Based on a patch from Ming Lei. Link: https://lkml.org/lkml/2019/4/23/64 Fixes: `576ed913` ("block: use bio_add_page in bio_iov_iter_get_pages") Reported-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-06-17 09:33:04 -06:00
Christoph Hellwig	ff896738be	block: return from __bio_try_merge_page if merging occured in the same page We currently have an input same_page parameter to __bio_try_merge_page to prohibit merging in the same page. The rationale for that is that some callers need to account for every page added to a bio. Instead of letting these callers call twice into the merge code to account for the new vs existing page cases, just turn the paramter into an output one that returns if a merge in the same page occured and let them act accordingly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-06-17 09:33:02 -06:00
Filipe Manana	3763771cf6	Btrfs: fix failure to persist compression property xattr deletion on fsync After the recent series of cleanups in the properties and xattrs modules that landed in the 5.2 merge window, we ended up with a regression where after deleting the compression xattr property through the setflags ioctl, we don't set the BTRFS_INODE_COPY_EVERYTHING flag in the inode anymore. As a consequence, if the inode was fsync'ed when it had the compression property set, after deleting the compression property through the setflags ioctl and fsync'ing again the inode, the log will still contain the compression xattr, because the inode did not had that bit set, which made the fsync not delete all xattrs from the log and copy all xattrs from the subvolume tree to the log tree. This regression happens due to the fact that that series of cleanups made btrfs_set_prop() call the old function do_setxattr() (which is now named btrfs_setxattr()), and not the old version of btrfs_setxattr(), which is now called btrfs_setxattr_trans(). Fix this by setting the BTRFS_INODE_COPY_EVERYTHING bit in the current btrfs_setxattr() function and remove it from everywhere else, including its setup at btrfs_ioctl_setflags(). This is cleaner, avoids similar regressions in the future, and centralizes the setup of the bit. After all, the need to setup this bit should only be in the xattrs module, since it is an implementation of xattrs. Fixes: `04e6863b19` ("btrfs: split btrfs_setxattr calls regarding transaction") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-06-17 16:37:17 +02:00
Rolf Eike Beer	259931fd3b	riscv: remove unused barrier defines They were introduced in commit `fab957c11e` ("RISC-V: Atomic and Locking Code") long after commit `2e39465abc` ("locking: Remove deprecated smp_mb__() barriers") removed the remnants of all previous instances from the tree. Signed-off-by: Rolf Eike Beer <eb@emlix.com> [paul.walmsley@sifive.com: stripped spurious mbox header from patch description; fixed commit references in patch header] Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-06-17 07:09:43 -07:00
Peter Chen	c19dffc0a9	usb: chipidea: udc: workaround for endpoint conflict issue An endpoint conflict occurs when the USB is working in device mode during an isochronous communication. When the endpointA IN direction is an isochronous IN endpoint, and the host sends an IN token to endpointA on another device, then the OUT transaction may be missed regardless the OUT endpoint number. Generally, this occurs when the device is connected to the host through a hub and other devices are connected to the same hub. The affected OUT endpoint can be either control, bulk, isochronous, or an interrupt endpoint. After the OUT endpoint is primed, if an IN token to the same endpoint number on another device is received, then the OUT endpoint may be unprimed (cannot be detected by software), which causes this endpoint to no longer respond to the host OUT token, and thus, no corresponding interrupt occurs. There is no good workaround for this issue, the only thing the software could do is numbering isochronous IN from the highest endpoint since we have observed most of device number endpoint from the lowest. Cc: <stable@vger.kernel.org> #v3.14+ Cc: Fabio Estevam <festevam@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Cc: Jun Li <jun.li@nxp.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-17 15:08:33 +02:00
Andy Gross	4b576d15df	MAINTAINERS: Change QCOM repo location This patch updates the Qualcomm SoC repo to a new location. Signed-off-by: Andy Gross <agross@kernel.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>	2019-06-17 05:13:01 -07:00
jjian zhou	20314ce30a	mmc: mediatek: fix SDIO IRQ detection issue If cmd19 timeout or response crcerr occurs during execute_tuning(), it need invoke msdc_reset_hw(). Otherwise SDIO IRQ can't be detected. Signed-off-by: jjian zhou <jjian.zhou@mediatek.com> Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com> Signed-off-by: Yong Mao <yong.mao@mediatek.com> Fixes: `5215b2e952` ("mmc: mediatek: Add MMC_CAP_SDIO_IRQ support") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2019-06-17 13:26:50 +02:00
jjian zhou	8a5df8ac62	mmc: mediatek: fix SDIO IRQ interrupt handle flow SDIO IRQ is triggered by low level. It need disable SDIO IRQ detected function. Otherwise the interrupt register can't be cleared. It will process the interrupt more. Signed-off-by: Jjian Zhou <jjian.zhou@mediatek.com> Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com> Signed-off-by: Yong Mao <yong.mao@mediatek.com> Fixes: `5215b2e952` ("mmc: mediatek: Add MMC_CAP_SDIO_IRQ support") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2019-06-17 13:25:44 +02:00
Wolfram Sang	b0e370b95a	mmc: core: complete HS400 before checking status We don't have a reproducible error case, yet our BSP team suggested that the mmc_switch_status() command in mmc_select_hs400() should come after the callback into the driver completing HS400 setup. It makes sense to me because we want the status of a fully setup HS400, so it will increase the reliability of the mmc_switch_status() command. Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Fixes: `ba6c7ac3a2` ("mmc: core: more fine-grained hooks for HS400 tuning") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2019-06-17 13:18:11 +02:00
ShihPo Hung	bf587caae3	riscv: mm: synchronize MMU after pte change Because RISC-V compliant implementations can cache invalid entries in TLB, an SFENCE.VMA is necessary after changes to the page table. This patch adds an SFENCE.vma for the vmalloc_fault path. Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com> [paul.walmsley@sifive.com: reversed tab->whitespace conversion, wrapped comment lines] Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-riscv@lists.infradead.org Cc: stable@vger.kernel.org	2019-06-17 03:44:44 -07:00
Will Deacon	c584b1202f	MAINTAINERS: Update my email address to use @kernel.org My @arm.com address will stop working at the end of August, so update to my @kernel.org address where you'll still be able to reach me. When I say "stop working" I really mean "will go to my line manager", so send patches there at your peril because they may reply with roadmaps and spreadsheets. You have been warned. Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: arm-soc <arm@kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2019-06-17 11:37:01 +01:00

... 5 6 7 8 9 ...

842748 commits