linux-hardened/arch/powerpc
Simon Guo d58badfb7c powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision
This patch add VMX primitives to do memcmp() in case the compare size
is equal or greater than 4K bytes. KSM feature can benefit from this.

Test result with following test program(replace the "^>" with ""):
------
># cat tools/testing/selftests/powerpc/stringloops/memcmp.c
>#include <malloc.h>
>#include <stdlib.h>
>#include <string.h>
>#include <time.h>
>#include "utils.h"
>#define SIZE (1024 * 1024 * 900)
>#define ITERATIONS 40

int test_memcmp(const void *s1, const void *s2, size_t n);

static int testcase(void)
{
        char *s1;
        char *s2;
        unsigned long i;

        s1 = memalign(128, SIZE);
        if (!s1) {
                perror("memalign");
                exit(1);
        }

        s2 = memalign(128, SIZE);
        if (!s2) {
                perror("memalign");
                exit(1);
        }

        for (i = 0; i < SIZE; i++)  {
                s1[i] = i & 0xff;
                s2[i] = i & 0xff;
        }
        for (i = 0; i < ITERATIONS; i++) {
		int ret = test_memcmp(s1, s2, SIZE);

		if (ret) {
			printf("return %d at[%ld]! should have returned zero\n", ret, i);
			abort();
		}
	}

        return 0;
}

int main(void)
{
        return test_harness(testcase, "memcmp");
}
------
Without this patch (but with the first patch "powerpc/64: Align bytes
before fall back to .Lshort in powerpc64 memcmp()." in the series):
	4.726728762 seconds time elapsed                                          ( +-  3.54%)
With VMX patch:
	4.234335473 seconds time elapsed                                          ( +-  2.63%)
		There is ~+10% improvement.

Testing with unaligned and different offset version (make s1 and s2 shift
random offset within 16 bytes) can archieve higher improvement than 10%..

Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24 22:03:21 +10:00
..
boot powerpc/dts: Use a correct at24 compatible fallback in ac14xx 2018-07-10 10:58:40 +10:00
configs powerpc: Add ppc32_allmodconfig defconfig target 2018-07-24 22:03:15 +10:00
crypto crypto: hash - annotate algorithms taking optional key 2018-01-12 23:03:35 +11:00
include powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision 2018-07-24 22:03:21 +10:00
kernel powerpc/mm: Check memblock_add against MAX_PHYSMEM_BITS range 2018-07-24 22:03:16 +10:00
kvm powerpc/powernv/ioda: Allocate indirect TCE levels on demand 2018-07-16 22:53:11 +10:00
lib powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision 2018-07-24 22:03:21 +10:00
math-emu License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mm powerpc/mm/hash: Reduce contention on hpte lock 2018-07-24 22:03:18 +10:00
net treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
oprofile treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
perf powerpc/64s: Remove POWER9 DD1 support 2018-07-16 11:37:21 +10:00
platforms powerpc/pseries/mm: Improve error reporting on HCALL failures 2018-07-24 22:03:19 +10:00
purgatory License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sysdev powerpc/mpic: Pass first free vector number to mpic_setup_error_int() 2018-07-19 21:58:09 +10:00
tools powerpc/kbuild: move -mprofile-kernel check to Kconfig 2018-06-11 09:16:29 +09:00
xmon Merge branch 'topic/ppc-kvm' into next 2018-07-19 14:37:57 +10:00
Kconfig powerpc: Enable kernel XZ compression option on BOOK3S_32 2018-07-04 22:41:10 +10:00
Kconfig.debug powerpc: Add new kconfig CONFIG_PPC_IRQ_SOFT_MASK_DEBUG 2018-01-19 22:37:03 +11:00
Makefile powerpc: Add ppc64le and ppc64_book3e allmodconfig targets 2018-07-24 22:03:16 +10:00
Makefile.postlink License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00