The reported problem with integrity verification on ARM64 (#269)
is the result of a very tight race condition with tracepoints.
The changes simplifying synchronization with the JUMP_LABEL engine
(f98da1b17c) affected the ARM64 platform differently, which made
such a race possible. However, potentially the same race may exist
on x86 as well. This commit fixes it and should address #269.
Post-6.3 Linux kernels modified 'struct module' and introduced a new
substructure describing the module's memory layout. Additionally, the
dynamic debug (ddebug) logic has been modified and some of the functions
which LKRG uses are no longer exported. This commit adapts to these
post-6.3 changes and addresses #267.
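A minimal sketch, not LKRG's actual code, of handling the 'struct module'
part of that change (the P_MODULE_TEXT_* macro names and the 6.4 version
cutoff are assumptions): new kernels describe module memory via
mod->mem[MOD_TEXT], older ones via mod->core_layout.

  #include <linux/module.h>
  #include <linux/version.h>

  #if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 4, 0)
  # define P_MODULE_TEXT_BASE(p_mod) ((p_mod)->mem[MOD_TEXT].base)
  # define P_MODULE_TEXT_SIZE(p_mod) ((p_mod)->mem[MOD_TEXT].size)
  #else
  # define P_MODULE_TEXT_BASE(p_mod) ((p_mod)->core_layout.base)
  # define P_MODULE_TEXT_SIZE(p_mod) ((p_mod)->core_layout.text_size)
  #endif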
Use macros, move logging and enforcement responses from callers into
the called functions, and remove duplication where it existed.
Unify our log and kernel panic messages.
This commit brings a few important changes:
- LKRG used to leverage SLAB_HWCACHE_ALIGN, but the memory overhead
  may be too significant for LKRG's use cases
- Since kernel 4.5+ we can use SLAB_ACCOUNT to make sure that
  LKRG's caches are standalone
- Modify the pCFI stack buffer cache to be smaller and decoupled
  from PAGE_SIZE (there is no reason for that coupling)
Additionally, this commit should help address #131
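A minimal sketch of the direction described above, with a hypothetical cache
name and object size: a dedicated cache created with SLAB_ACCOUNT (available
since kernel 4.5) instead of relying on SLAB_HWCACHE_ALIGN, so it stays
separate from other caches.

  #include <linux/slab.h>
  #include <linux/errno.h>

  static struct kmem_cache *p_example_cache;

  static int p_example_cache_init(void)
  {
     /* 0x400 is an illustrative object size, not LKRG's actual value */
     p_example_cache = kmem_cache_create("p_example_cache", 0x400, 0,
                                         SLAB_ACCOUNT, NULL);
     return p_example_cache ? 0 : -ENOMEM;
  }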
On the latest kernels (5.15+) the get/put_online_cpus() API is deprecated
and new synchronization functions must be used. This commit addresses
that issue and #118.
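A minimal compatibility sketch (the p_get/put_cpus wrapper names are
hypothetical): use cpus_read_lock()/cpus_read_unlock() where available and
fall back to the old get/put_online_cpus() API on older kernels.

  #include <linux/cpu.h>
  #include <linux/version.h>

  #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 13, 0)
  # define p_get_cpus()  cpus_read_lock()
  # define p_put_cpus()  cpus_read_unlock()
  #else
  # define p_get_cpus()  get_online_cpus()
  # define p_put_cpus()  put_online_cpus()
  #endif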
New Linux kernels may be built with the CONFIG_GCC_PLUGIN_RANDSTRUCT
option. This randomly changes the order of fields in certain structures,
including selinux_state. Currently, LKRG isn't capable of recreating the
structure layout. Thus, we have to disable LKRG's SELinux monitoring on
kernels built with this option.
CONFIG_GCC_PLUGIN_RANDSTRUCT was introduced to make it harder for attackers
to overwrite particular fields of structures. LKRG's goal was the same.
So even with LKRG's monitoring disabled, we still have some mitigation
against SELinux state overwrites.
We might make LKRG capable of recreating randomized structures in the future.
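A minimal sketch of gating the SELinux monitoring on that build-time option
(the P_SELINUX_VERIFY knob is hypothetical):

  #ifdef CONFIG_GCC_PLUGIN_RANDSTRUCT
  # define P_SELINUX_VERIFY 0   /* selinux_state layout is randomized: skip it */
  #else
  # define P_SELINUX_VERIFY 1
  #endif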
We do not want to support RT kernels (at least not for now). RT kernels are
commonly used in medical and similar devices, where reliability is crucial.
It is safer not to support RT kernels in LKRG for now.
For more information, please read the entire discussion at #40.
Since kernel 5.8+, 'native_write_cr4' must be manually resolved. However, this is x86-specific code which should not be executed on other platforms. This commit fixes that and addresses #48.
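A conceptual sketch (the p_lookup_symbol resolver and the pointer name are
hypothetical placeholders for LKRG's own symbol lookup): the resolution is
compiled only for x86, so it cannot run on other platforms.

  #include <linux/errno.h>

  #ifdef CONFIG_X86
  /* hypothetical resolver standing in for LKRG's symbol lookup */
  extern unsigned long p_lookup_symbol(const char *p_name);

  static void (*p_native_write_cr4)(unsigned long);

  static int p_resolve_native_write_cr4(void)
  {
     p_native_write_cr4 =
        (void (*)(unsigned long))p_lookup_symbol("native_write_cr4");
     return p_native_write_cr4 ? 0 : -ENOENT;
  }
  #endif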
Some custom kernel builds might aggressively inline functions which are
critical from LKRG's perspective. That's problematic for the project.
However, some of the problems *might* be solved by uncommenting this new
definition (P_KERNEL_AGGRESSIVE_INLINING). Unfortunately, not all of the
problems can be solved by it (at least not for now). You need to experiment.
This can be useful to address issues like #40
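A minimal sketch of how the opt-in knob is meant to be used (the guarded code
here is illustrative, not LKRG's actual workaround): uncomment the definition
and rebuild to enable the alternative handling.

  /* #define P_KERNEL_AGGRESSIVE_INLINING 1 */

  #ifdef P_KERNEL_AGGRESSIVE_INLINING
   /* resolve symbols / place hooks differently for inlined functions */
  #endif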
On aggressively optimized kernels it is possible that the kprobe optimizer
won't be fast enough to do its job before LKRG creates its own database.
This is problematic because LKRG might snapshot a hash of the kernel's
.text section while its own hooks are still non-optimized. As soon as the
kprobe optimizer finishes the job, the previously snapshotted hash won't be
correct and LKRG will detect this inconsistency.
To correctly handle this unusual corner case, LKRG can wait for the kprobe
optimizer before creating the database.
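A conceptual sketch (the function pointer and the way it gets resolved are
assumptions): let the kprobe optimizer settle before hashing .text, so the
database reflects the final code bytes.

  /* hypothetical pointer to the kernel's routine that waits for the
     kprobe optimizer to finish */
  static void (*p_wait_for_kprobe_optimizer)(void);

  static void p_create_database(void)
  {
     if (p_wait_for_kprobe_optimizer)
        p_wait_for_kprobe_optimizer();   /* block until optimization is done */

     /* ... hash kernel and module .text, store the baseline ... */
  }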
p_kzfree() wraps the kzfree() call for kernels < 5.10 and kfree_sensitive()
otherwise. This reflects the changes made in the kernel since
23224e45004ed84c8466fd1e8e5860f541187029 and fixes the build against
kernel 5.10.
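A sketch of that wrapper (the exact form in LKRG may differ; the 5.10 cutoff
follows this message):

  #include <linux/slab.h>
  #include <linux/version.h>

  static inline void p_kzfree(void *p_ptr)
  {
  #if LINUX_VERSION_CODE < KERNEL_VERSION(5, 10, 0)
     kzfree(p_ptr);            /* old name: zeroes the memory, then frees it */
  #else
     kfree_sensitive(p_ptr);   /* the renamed API */
  #endif
  }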
We don't need to introduce a custom LKRG-counter lock to synchronize with the JUMP_LABEL engine and avoid a potential deadlock with FTRACE. We can check whether the jump_label lock is taken after acquiring the ftrace lock and before taking text_mutex.
This simplification changes the p_text_section_(un)lock API.
This also fixes a problem reported by Jacek.
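A conceptual sketch of that ordering (the resolved mutex pointers and the
exact error handling are assumptions): take the ftrace lock first, back off
if a JUMP_LABEL update is in flight, then take text_mutex.

  #include <linux/mutex.h>
  #include <linux/errno.h>

  /* hypothetical pointers resolved to the kernel's ftrace_lock,
     jump_label_mutex and text_mutex */
  extern struct mutex *p_ftrace_lock, *p_jump_label_mutex, *p_text_mutex;

  static int p_text_section_lock(void)
  {
     mutex_lock(p_ftrace_lock);
     if (mutex_is_locked(p_jump_label_mutex)) {
        mutex_unlock(p_ftrace_lock);
        return -EBUSY;                /* caller retries later */
     }
     mutex_lock(p_text_mutex);
     return 0;
  }

  static void p_text_section_unlock(void)
  {
     mutex_unlock(p_text_mutex);
     mutex_unlock(p_ftrace_lock);
  }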
1) We are hooking into FTRACE's internal functions to be able to monitor when new modifications are executed and react accordingly.
2) The Linux kernel has bugs in its FTRACE code; LKRG may highlight them.
3) We are introducing the 'p_state_init' variable to track when LKRG's full initialization is complete.
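A minimal sketch of the idea behind 'p_state_init' (the type, values and the
p_ftrace_event user shown here are assumptions):

  static int p_state_init;    /* 0 = still initializing, non-zero = fully up */

  static void p_ftrace_event(void)
  {
     if (!p_state_init)
        return;               /* database not ready yet: skip verification */
     /* ... normal integrity reaction ... */
  }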
1) This is necessary for future FTRACE support. FTRACE is not fully synchronized with JUMP_LABEL (which I think is buggy logic in the kernel). However, we can manually add such logic. The way text_mutex is used by both subsystems makes it prone to deadlock if a third system wants to synchronize with both of them.
2) The new lock enforces changes in the p_text_section_(un)lock API, which we do in the same commit.
3) Introduce a new LKRG counter-lock API: trylock.
4) Add a few minor changes:
- notrace attribute (probably we need to add such attributes to the majority of our functions; see the sketch below)
- add information about the module name in case of KMOD notifier activity
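A minimal sketch of the attribute mentioned in 4) (the function itself is
hypothetical): notrace keeps ftrace from instrumenting LKRG's own routines,
so they don't feed back into the code paths LKRG monitors.

  #include <linux/compiler.h>

  static void notrace p_example_helper(void)
  {
     /* ... */
  }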
With the current design of the JUMP_LABEL support we do not need to manually take this mutex. Our hooks are deep enough to be protected, and the integrity routine depends on text_mutex.
Due to kernel commit f3ac60671954c ("sched/headers: Move task-stack
related APIs from <linux/sched.h> to <linux/sched/task_stack.h>") (Linux
v4.11) `linux/sched/task_stack.h' should be included to access
`task_stack_page'.
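A sketch of the resulting version-gated include (assuming a plain kernel
version check is sufficient):

  #include <linux/version.h>
  #include <linux/sched.h>
  #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
  #include <linux/sched/task_stack.h>   /* task_stack_page() lives here now */
  #endif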
The compilation failure being fixed appears on the armv8l arch:
In file included from ./include/linux/prefetch.h:15,
from ./arch/arm/include/asm/atomic.h:12,
from ./include/linux/atomic.h:7,
from ./include/asm-generic/bitops/lock.h:5,
from ./arch/arm/include/asm/bitops.h:243,
from ./include/linux/bitops.h:26,
from ./include/linux/kernel.h:12,
from /usr/src/RPM/BUILD/lkrg-0.8.1/src/modules/exploit_detection/../../p_lkrg_main.h:23,
from /usr/src/RPM/BUILD/lkrg-0.8.1/src/modules/exploit_detection/p_exploit_detection.c:18:
/usr/src/RPM/BUILD/lkrg-0.8.1/src/modules/exploit_detection/p_exploit_detection.c: In function 'p_iterate_processes':
./arch/arm/include/asm/processor.h:99:40: error: implicit declaration of function 'task_stack_page'; did you mean 'walk_stackframe'? [-Werror=implicit-function-declaration]
99 | ((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
| ^~~~~~~~~~~~~~~
/usr/src/RPM/BUILD/lkrg-0.8.1/src/modules/exploit_detection/p_exploit_detection.c:779:30: note: in expansion of macro 'task_pt_regs'
779 | p_regs_set_ip(task_pt_regs(p_tmp), -1);
| ^~~~~~~~~~~~
cc1: some warnings being treated as errors
make[1]: *** [scripts/Makefile.build:265: /usr/src/RPM/BUILD/lkrg-0.8.1/src/modules/exploit_detection/p_exploit_detection.o] Error 1
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
This fixes the LKRG build on Linux 5.8+, which renamed that header file. Thanks to
Andy Lavr for reporting this problem and suggesting a (different) fix, which
made us revisit our use of that header file.
We only need that header file on older kernels (< 4.4.72 or < RHEL 7.4) for the
one use of md5_transform() in get_random_long(). On newer kernels, we simply
use the kernel-provided get_random_long(). Further, 5.8's crypto/sha.h doesn't
declare md5_transform() anyway (linux/cryptohash.h on much older kernels did).
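A sketch of the resulting conditional include (the version cutoff follows
this message; the RHEL 7.4 check is omitted): only old kernels that lack
get_random_long() pull in the md5_transform() declaration for a local
fallback.

  #include <linux/version.h>
  #include <linux/random.h>

  #if LINUX_VERSION_CODE < KERNEL_VERSION(4, 4, 72)
  #include <linux/cryptohash.h>   /* declares md5_transform() on these kernels */
  #endif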
- Not all hooks are fatal. If for any reason a non-fatal hook can't be placed, continue initialization and print an appropriate message
- If a hook is fatal, stop initialization
[2] Add support for ISRA optimized functions:
- Some of the functions might be optimized by ISRA. However, some of the hooks can still be functional even for ISRA-optimized functions.
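A conceptual sketch (the p_lookup_symbol resolver is a hypothetical
placeholder for LKRG's symbol lookup): if the plain symbol is not found,
retry with the ".isra.0" suffix that GCC's interprocedural
scalar-replacement pass can append to an optimized function's name.

  #include <linux/kallsyms.h>   /* KSYM_NAME_LEN */
  #include <linux/kernel.h>     /* snprintf */

  extern unsigned long p_lookup_symbol(const char *p_name);

  static unsigned long p_lookup_symbol_maybe_isra(const char *p_name)
  {
     char p_buf[KSYM_NAME_LEN];
     unsigned long p_addr = p_lookup_symbol(p_name);

     if (!p_addr) {
        snprintf(p_buf, sizeof(p_buf), "%s.isra.0", p_name);
        p_addr = p_lookup_symbol(p_buf);
     }
     return p_addr;
  }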