linux-hardened/kernel
Daniel Borkmann d8eca5bbb2 bpf: implement lookup-free direct value access for maps
This generic extension to BPF maps allows for directly loading
an address residing inside a BPF map value as a single BPF
ldimm64 instruction!

The idea is similar to what BPF_PSEUDO_MAP_FD does today, which
is a special src_reg flag for ldimm64 instruction that indicates
that inside the first part of the double insns's imm field is a
file descriptor which the verifier then replaces as a full 64bit
address of the map into both imm parts. For the newly added
BPF_PSEUDO_MAP_VALUE src_reg flag, the idea is the following:
the first part of the double insns's imm field is again a file
descriptor corresponding to the map, and the second part of the
imm field is an offset into the value. The verifier will then
replace both imm parts with an address that points into the BPF
map value at the given value offset for maps that support this
operation. Currently supported is array map with single entry.
It is possible to support more than just single map element by
reusing both 16bit off fields of the insns as a map index, so
full array map lookup could be expressed that way. It hasn't
been implemented here due to lack of concrete use case, but
could easily be done so in future in a compatible way, since
both off fields right now have to be 0 and would correctly
denote a map index 0.

The BPF_PSEUDO_MAP_VALUE is a distinct flag as otherwise with
BPF_PSEUDO_MAP_FD we could not differ offset 0 between load of
map pointer versus load of map's value at offset 0, and changing
BPF_PSEUDO_MAP_FD's encoding into off by one to differ between
regular map pointer and map value pointer would add unnecessary
complexity and increases barrier for debugability thus less
suitable. Using the second part of the imm field as an offset
into the value does /not/ come with limitations since maximum
possible value size is in u32 universe anyway.

This optimization allows for efficiently retrieving an address
to a map value memory area without having to issue a helper call
which needs to prepare registers according to calling convention,
etc, without needing the extra NULL test, and without having to
add the offset in an additional instruction to the value base
pointer. The verifier then treats the destination register as
PTR_TO_MAP_VALUE with constant reg->off from the user passed
offset from the second imm field, and guarantees that this is
within bounds of the map value. Any subsequent operations are
normally treated as typical map value handling without anything
extra needed from verification side.

The two map operations for direct value access have been added to
array map for now. In future other types could be supported as
well depending on the use case. The main use case for this commit
is to allow for BPF loader support for global variables that
reside in .data/.rodata/.bss sections such that we can directly
load the address of them with minimal additional infrastructure
required. Loader support has been added in subsequent commits for
libbpf library.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:46 -07:00
..
bpf bpf: implement lookup-free direct value access for maps 2019-04-09 17:05:46 -07:00
cgroup Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-03-12 14:08:19 -07:00
configs kvm_config: add CONFIG_VIRTIO_MENU 2018-10-24 20:55:56 -04:00
debug kdb: use bool for binary state indicators 2018-12-30 08:31:52 +00:00
dma memblock: drop memblock_alloc_*_nopanic() variants 2019-03-12 10:04:02 -07:00
events perf/core improvements and fixes: 2019-03-22 22:50:41 +01:00
gcov kernel/gcov/gcc_3_4.c: use struct_size() in kzalloc() 2019-03-07 18:32:02 -08:00
irq genirq: Mark expected switch case fall-through 2019-03-23 12:32:01 +01:00
livepatch Merge branch 'for-5.1/atomic-replace' into for-linus 2019-03-05 15:56:59 +01:00
locking locking/lockdep: Only call init_rcu_head() after RCU has been initialized 2019-03-09 14:15:51 +01:00
power treewide: add checks for the return value of memblock_alloc*() 2019-03-12 10:04:02 -07:00
printk fbdev changes for v5.1: 2019-03-15 14:22:59 -07:00
rcu Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-06 07:59:36 -08:00
sched Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-24 11:42:10 -07:00
time time/jiffies: Make refined_jiffies static 2019-03-22 13:38:26 +01:00
trace ftrace: Fix warning using plain integer as NULL & spelling corrections 2019-03-26 08:35:36 -04:00
.gitignore kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
acct.c
async.c async: Add support for queueing on specific NUMA node 2019-01-31 14:20:54 +01:00
audit.c audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL 2019-02-03 17:49:35 -05:00
audit.h audit: hide auditsc_get_stamp and audit_serial prototypes 2019-02-07 21:44:27 -05:00
audit_fsnotify.c audit: add syscall information to CONFIG_CHANGE records 2019-01-18 17:53:29 -05:00
audit_tree.c audit: hand taken context to audit_kill_trees for syscall logging 2019-01-14 18:01:05 -05:00
audit_watch.c audit: add syscall information to CONFIG_CHANGE records 2019-01-18 17:53:29 -05:00
auditfilter.c audit: mark expected switch fall-through 2019-02-12 20:17:13 -05:00
auditsc.c audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL 2019-02-03 17:49:35 -05:00
backtracetest.c
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-10-31 08:54:14 -07:00
capability.c LSM: add SafeSetID module that gates setid calls 2019-01-25 11:22:43 -08:00
compat.c time: make adjtime compat handling available for 32 bit 2019-02-07 00:13:27 +01:00
configs.c kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
context_tracking.c
cpu.c Linux 5.0-rc5 2019-02-04 08:57:24 +01:00
cpu_pm.c
crash_core.c kexec: export PG_offline to VMCOREINFO 2019-03-05 21:07:14 -08:00
crash_dump.c
cred.c SELinux: Remove cred security blob poisoning 2019-01-08 13:18:44 -08:00
delayacct.c delayacct: track delays from thrashing cache pages 2018-10-26 16:26:32 -07:00
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2019-03-07 10:11:41 -08:00
extable.c
fail_function.c kernel/fail_function.c: remove meaningless null pointer check before debugfs_remove_recursive 2018-10-31 08:54:12 -07:00
fork.c 5.1 Merge Window Pull Request 2019-03-09 15:53:03 -08:00
freezer.c PM / reboot: Eliminate race between reboot and suspend 2018-08-06 12:35:20 +02:00
futex.c futex: Ensure that futex address is aligned in handle_futex_death() 2019-03-22 13:05:26 +01:00
groups.c
hung_task.c kernel/hung_task.c: Use continuously blocked time when reporting. 2019-03-07 18:31:59 -08:00
iomem.c
irq_work.c
jump_label.c jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
kallsyms.c bpf: Add module name [bpf] to ksymbols for bpf programs 2019-01-21 17:38:56 -03:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks bpf: introduce bpf_spin_lock 2019-02-01 20:55:38 +01:00
Kconfig.preempt kconfig: warn no new line at end of file 2018-12-15 17:44:35 +09:00
kcov.c kcov: convert kcov.refcount to refcount_t 2019-03-07 18:32:02 -08:00
kexec.c
kexec_core.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
kexec_file.c kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump 2018-12-06 14:38:50 +00:00
kexec_internal.h
kmod.c
kprobes.c kprobes: Search non-suffixed symbol in blacklist 2019-02-13 08:16:40 +01:00
ksysfs.c
kthread.c Merge branch 'akpm' (patches from Andrew) 2019-03-06 10:31:36 -08:00
latencytop.c
Makefile kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
memremap.c mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm 2018-12-28 12:11:52 -08:00
module-internal.h
module.c dynamic_debug: add static inline stub for ddebug_add_module 2019-03-07 18:32:00 -08:00
module_signing.c modsign: use all trusted keys to verify module signature 2018-11-07 14:41:41 +01:00
notifier.c
nsproxy.c
padata.c padata: clean an indentation issue, remove extraneous space 2018-11-16 14:11:04 +08:00
panic.c kernel/panic.c: taint: fix debugfs_simple_attr.cocci warnings 2019-03-07 18:31:59 -08:00
params.c
pid.c Fix failure path in alloc_pid() 2018-12-28 12:42:30 -08:00
pid_namespace.c signal: Use group_send_sig_info to kill all processes in a pid namespace 2018-09-16 16:08:25 +02:00
profile.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
ptrace.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
range.c
reboot.c kernel/reboot.c: export pm_power_off_prepare 2018-09-11 16:13:24 +01:00
relay.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-03-12 13:27:20 -07:00
resource.c device-dax for 5.1 2019-03-16 13:05:32 -07:00
rseq.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
seccomp.c Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-03-07 11:44:01 -08:00
signal.c pidfd patches for v5.1-rc1 2019-03-16 13:47:14 -07:00
smp.c cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM 2019-01-30 19:27:00 +01:00
smpboot.c
smpboot.h
softirq.c softirq: Don't skip softirq execution when softirq thread is parking 2019-02-10 21:51:39 +01:00
stackleak.c stackleak: Mark stackleak_track_stack() as notrace 2018-12-05 19:31:44 -08:00
stacktrace.c
stop_machine.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
sys.c Merge branch 'akpm' (patches from Andrew) 2019-03-07 19:25:37 -08:00
sys_ni.c pidfd patches for v5.1-rc1 2019-03-16 13:47:14 -07:00
sysctl.c kernel/sysctl.c: define minmax conv functions in terms of non-minmax versions 2019-03-12 10:04:00 -07:00
sysctl_binary.c kernel/sysctl: add panic_print into sysctl 2019-01-04 13:13:47 -08:00
task_work.c
taskstats.c genetlink: make policy common to family 2019-03-22 10:38:23 -04:00
test_kprobes.c
torture.c Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', 'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEAD 2019-02-09 08:47:52 -08:00
tracepoint.c tracing: Replace synchronize_sched() and call_rcu_sched() 2018-11-27 09:21:41 -08:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c umh: add exit routine for UMH process 2019-01-11 18:05:40 -08:00
up.c smp,cpumask: introduce on_each_cpu_cond_mask 2018-10-09 16:51:11 +02:00
user-return-notifier.c
user.c userns: use irqsave variant of refcount_dec_and_lock() 2018-08-22 10:52:47 -07:00
user_namespace.c userns: also map extents in the reverse map to kernel IDs 2018-11-07 23:51:16 -06:00
utsname.c
utsname_sysctl.c sys: don't hold uts_sem while accessing userspace memory 2018-08-11 02:05:53 -05:00
watchdog.c watchdog/core: Make variables static 2019-03-22 13:40:17 +01:00
watchdog_hld.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
workqueue.c workqueue: Only unregister a registered lockdep key 2019-03-21 12:00:18 +01:00
workqueue_internal.h psi: fix aggregation idle shut-off 2019-02-01 15:46:23 -08:00