Crap... (Literally)
parent
7b85fde03c
commit
3004db59e4
12 changed files with 0 additions and 803 deletions
|
@ -1 +0,0 @@
|
|||
cpuburn-1.4
|
|
@ -1,118 +0,0 @@
|
|||
I wrote these programs to fill a vacuum. Chris Brady's memtest-86 is
|
||||
an excellent program for testing memory, but I wanted something that
|
||||
would do stability testing for CPUs since I had decided to overclock
|
||||
my pair of Celeron 366's on an Abit BP-6 motherboard. No comments from
|
||||
the peanut gallery. burnBX was added to test RAM & controller stability
|
||||
|
||||
Other than much vilified overclockers, other people may find these
|
||||
programs useful. System builders may wish to test their systems and
|
||||
heatsinks. PC buyers may wish to test their systems, particularly if
|
||||
they have doubts about the builder's expertise. Leaving out thermal
|
||||
interface material (grease) on the heatsink is a likely flaw.
|
||||
|
||||
The usual advice is to run kernel compiles. This is dangerous since a
|
||||
crash will certainly corrupt the filesystem with all the files make -j 4
|
||||
will have open. Worse, I doubt that gcc has any significant FPU code.
|
||||
Worse still, gcc is compiled with gcc, and I doubted that it would
|
||||
produce highly optimized code.
|
||||
|
||||
Since I couldn't find anything, I decided to write it.
|
||||
|
||||
It's certain that Intel and other CPU manufacturers have devoted enormous
|
||||
effort to CPU testing. They have some programs for stability testing
|
||||
and parts speed rating ("binning"). Some of these (HIPWR30.EXE) are
|
||||
available to qualified Intel customers under NDA.
|
||||
|
||||
I wanted a program that would load the CPU to maximum. Unintentionally,
|
||||
code optimization does this. I chose a base of FPU code (DDOT) since
|
||||
I believed from 8087 days that the FPU consumes a lot of current, and
|
||||
was untested by gcc. Then integer instructions were slipped into their
|
||||
shadow to try to keep the other P6 ports loaded. Agner Fog's excellent
|
||||
article helped quite a bit. Trial and much error.
|
||||
|
||||
I also tried to choose data (all-bits-lit) that would maximize power
|
||||
consumption. But I do not claim that my code is the most optimized
|
||||
nor the most power consuming. There could always be better.
|
||||
|
||||
Once I found lm-sensors, I could measure the results of my efforts.
|
||||
Subject to thermistor vagaries, here are my results [revised]:
|
||||
|
||||
29'C at idle (hlt)
|
||||
41' doing idle loop
|
||||
46' mprime95 (as-is or reniced -19)
|
||||
47' make -j 4 on kernel
|
||||
47' 2 * burnP5 (estimated)
|
||||
47' 2 * burnBX L (default, 4 MB)
|
||||
48' 2 * burnMMX L
|
||||
48' 2 * burnK6 (estimated)
|
||||
50' 2 * burnMMX F (default, 64 kB in L2)
|
||||
51' 2 * burnMMX D (16 kB, L1 cache)
|
||||
51' 2 * burnP6 on zeroes for data
|
||||
52' 2 * burnP6 with FF's for data
|
||||
|
||||
All at 2 * 5.5 * 97 MHz (26'C ambient). Higher and my CPU1 will lockup
|
||||
under burnP6 in 5-10 min. Kernel compiles are stable to 99 MHz for
|
||||
24 h. But 98 MHz will give `burnBX` errors every 5-8 hours, and 95
|
||||
MHz will give burnMMX D errors every ~6 hours, so now I run 94 MHz.
|
||||
Errors seem to increase 10x for every 1 MHz.
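The clock figures above can be unpacked with a little arithmetic. This is a minimal Python sketch, assuming that "2 * 5.5 * 97 MHz" means two CPUs, each at a 5.5x multiplier on a 97 MHz bus, and treating the "10x for every 1 MHz" observation as a rough exponential rule of thumb. The function names are made up for this illustration.

```python
def core_clock_mhz(multiplier, bus_mhz):
    """Core clock of one CPU: multiplier times the bus clock."""
    return multiplier * bus_mhz


def relative_error_rate(bus_mhz, reference_bus_mhz, factor_per_mhz=10.0):
    """Error rate relative to a reference bus speed, assuming the rate
    multiplies by factor_per_mhz for every additional 1 MHz (rule of thumb)."""
    return factor_per_mhz ** (bus_mhz - reference_bus_mhz)


if __name__ == "__main__":
    for bus in (94, 97, 99):
        print(bus, "MHz bus ->", core_clock_mhz(5.5, bus), "MHz core")
    # Effect of backing off from 95 MHz to 94 MHz under the rule of thumb
    print("relative error rate:", relative_error_rate(94, 95))
```

Under this model, each 1 MHz step down cuts the expected error rate by roughly a factor of ten, which is why a small downclock can take errors from every few hours to rare.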
|
||||
|
||||
I got tired of waiting for temperature steady-state so I measure current
|
||||
instead. Mostly I use the ATX power harness as a shunt, and measure
|
||||
current by voltage drop. Email for details. This permits testing many
|
||||
different instruction mix ideas quickly. As it turns out, the original
|
||||
burnP6 is close to the best I've found, needing only minor tweaking for
|
||||
a 2% improvement. The optimum burnK6 is also fairly similar, with just
|
||||
minor architectural adjustments for AMD.
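Measuring current by voltage drop is Ohm's law applied to the wiring. This is a minimal Python sketch. The resistance and voltage values in the usage note are hypothetical placeholders, not the author's measurements, and the harness resistance would have to be calibrated separately.

```python
def current_from_drop(voltage_drop_v, shunt_resistance_ohm):
    """Ohm's law, I = V / R, with the wiring harness acting as the shunt."""
    if shunt_resistance_ohm <= 0:
        raise ValueError("shunt resistance must be positive")
    return voltage_drop_v / shunt_resistance_ohm


def relative_gain(new_current_a, old_current_a):
    """Fractional change in current draw between two instruction mixes."""
    return new_current_a / old_current_a - 1.0
```

For example, a hypothetical 50 mV drop across a hypothetical 10 milliohm harness corresponds to about 5 A. Comparing two such readings with `relative_gain` gives the kind of percentage improvement quoted above.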
|
||||
|
||||
I also did some measurements with an inductive ammeter. They gave 90% of
|
||||
the estimated maximum datasheet current draw for burnP6. So I'm fairly
|
||||
happy with the code. But suggestions for improvement are most welcome.
|
||||
I don't claim this code is perfect, nor that it will catch all system
|
||||
deficiencies.
|
||||
|
||||
BURNBX: This program has been quite frustrating to develop. It's hard to
|
||||
measure the results. I've finally hit on a reasonable pattern (walking
|
||||
bit through carry, inverted every quadword except for cacheline leadoff)
|
||||
that really brings out errors, and occasional lockups (more on FreeBSD).
|
||||
The 82443BX only gets to 42'C.
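The fill pattern described here can be modelled in a few lines. This is a Python sketch of the initial fill loop as it appears in burnBX.S in this commit, assuming the carry flag starts clear (the preceding `xorl` clears it). The helper names are invented for illustration. Each loop pass writes 64 bytes (two 32-byte cachelines), then rotates the word left through carry so a single zero bit walks with a 33-step period.

```python
MASK = 0xFFFFFFFF


def rcl1(value, carry):
    """Rotate a 32-bit value left by one through the carry flag."""
    new_carry = (value >> 31) & 1
    new_value = ((value << 1) | carry) & MASK
    return new_value, new_carry


def fill_pattern(chunks):
    """Return the 32-bit words written by `chunks` passes of the fill loop."""
    eax, carry = MASK, 0
    words = []
    for _ in range(chunks):
        edx = eax ^ MASK  # inverted copy of the walking pattern
        # Quadword order within the 64 bytes: F F 0 F, F 0 F 0
        words.extend([eax, eax, eax, eax, edx, edx, eax, eax,
                      eax, eax, edx, edx, eax, eax, edx, edx])
        eax, carry = rcl1(eax, carry)
    return words
```

Every quadword holds two identical 32-bit halves, which is the property the later data check relies on.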
|
||||
|
||||
Essentially, burnBX is a RAM tester, using whatever pages the OS allocates
|
||||
to the process. As such, it cannot test kernel RAM. But it is designed
|
||||
to be very intense, using the P6 optimized `rep movsd` instructions.
|
||||
Please note that burnBX is _not_ optimal on AMD K6 based systems because
|
||||
they don't have the optimized `rep movsd` block move.
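The block moves can be pictured with plain lists. This is a Python sketch of one pass of the main loop in burnBX.S as shown in this commit: copy the block up, copy it back shifted by one cacheline, then wrap the last cacheline around to the front. It assumes 32-byte cachelines (eight 32-bit words) and a buffer length that is a multiple of 16 words. The names are illustrative only.

```python
CACHELINE_DWORDS = 8  # one 32-byte cacheline holds eight 32-bit words


def thrash_once(buffer):
    """Model one up-and-back block move; returns the new buffer contents."""
    n = len(buffer)
    buf2 = list(buffer)                                        # move block up
    result = [0] * n
    result[CACHELINE_DWORDS:] = buf2[:n - CACHELINE_DWORDS]    # move back, shifted
    result[:CACHELINE_DWORDS] = buf2[n - CACHELINE_DWORDS:]    # wrap last cacheline
    return result


def data_ok(buffer):
    """The data check: both 32-bit halves of every quadword must match."""
    return all(buffer[i] == buffer[i + 1] for i in range(0, len(buffer), 2))
```

Because the data is only rotated by whole cachelines, a correct machine keeps `data_ok` true forever. Any single flipped bit breaks the equality of one quadword and is caught at the next check.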
|
||||
|
||||
Beta testers have mostly reported quick error terminations. Their impact
|
||||
should not be minimized, because such a data error could occur in kernel
|
||||
code, causing system crashes. The errors may be from the CPU/BX bus, in
|
||||
which case ECC RAM will not help. The cause is not perfectly clear, but
|
||||
general case & 440BX cooling helps and so does an adequate power supply.
|
||||
300W is suggested.
|
||||
|
||||
Errors on my "instrumented" version of burnBX have not been isolated
|
||||
to one memory cell but have been distributed across many addresses and
|
||||
a few bits [only one at a time]. It is suspected that there is a bus
|
||||
or transistor driver problem. Or there may be undetected transients in
|
||||
the 3.3 voltage.
|
||||
|
||||
REVISED BURNMMX: I started this project as simply a way for AMD system
|
||||
owners to check out their systems. I was very surprised when my own
|
||||
system started throwing errors with the MMX memory moves, and had to
|
||||
downclock from 2 * 5.5 * 97 MHz to 94 MHz. It would seem that the simple
|
||||
memory moves are more fragile (less robust to interrupts) than the 2%
|
||||
higher bandwidth string moves.
|
||||
|
||||
BURNK7: I finally bought an AMD Athlon and had to write a tester even
|
||||
though I don't overclock it. Writing burnK7 was much trial and error,
|
||||
but the ammeter gave me immediate feedback on my efforts. The powerful
|
||||
K7 core was easy and fun to optimize. I parallel pathed DDOT to remove a
|
||||
dependency, and could have gone much further, but current didn't increase,
|
||||
so I stuffed in integer instructions which did increase current. On my
|
||||
850 Thunderbird, burnK7 draws 9% more power than burnK6.
|
||||
|
||||
|
||||
Robert Redelmeier redelm@ev1.net June 15, 2001
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
all : burnP5 burnP6 burnK6 burnK7 burnBX burnMMX
|
||||
.S:
|
||||
gcc -s -nostdlib -o $@ $<
|
|
@ -1,77 +0,0 @@
|
|||
N E W burnK7 for the AMD Athlon/Duron has been released.
|
||||
|
||||
These programs are designed to load x86 CPUs as heavily as possible for
|
||||
the purposes of system testing. They have been optimized for different
|
||||
processors. FPU and ALU instructions are coded in an assembler endless loop.
|
||||
They do not test every instruction. The goal has been to maximize heat
|
||||
production from the CPU, putting stress on the CPU itself, cooling
|
||||
system, motherboard (especially voltage regulators) and power supply
|
||||
(likely cause of burnBX/MMX errors).
|
||||
|
||||
burnP5 is optimized for Intel Pentium processors with and without MMX
|
||||
P6 is for Intel PentiumPro, PentiumII&III and Celeron CPUs
|
||||
K6 is for AMD K6 processors
|
||||
K7 is for AMD Athlon/Duron processors
|
||||
MMX is to test cache/memory interfaces on all CPUs with MMX
|
||||
BX is an alternate cache/memory test for Intel CPUs
|
||||
|
||||
TO USE: root privileges are NOT required. It has been designed for ELF
|
||||
Linux, but also tested under FreeBSD and a.out. Burn testing
|
||||
is best done from a ramdisk distribution (tomsrtbt) or with
|
||||
filesystems unmounted or mounted read-only. untar the source
|
||||
in a convenient directory:
|
||||
`tar zxf cpuburn`
|
||||
compile executables
|
||||
`make`
|
||||
run desired program in background [ _repeat_ for SMP]:
|
||||
`burnP6 || echo $? &`
|
||||
|
||||
Monitor progress of cpuburn by `ps`. When finished, `kill` the burn*
|
||||
process(es). If you have temperature probes (fingers) or the lm-sensors
|
||||
package, you can check your CPU temperature and/or system voltages.
|
||||
|
||||
If an error occurs in calculations, it will be preserved, and the
|
||||
program will terminate with error code 254 for an integer/memory error,
|
||||
and error code 255 for a FP/MMX error. Error checking happens every
|
||||
10-40 sec for burnP6/K6/K7 and I haven't seen any CPU errors in testing
|
||||
[lockups occur first]. burnBX and burnMMX check for error every 512 MB
|
||||
(4-10 sec), and error termination is frequently seen; lockups are rarer.
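A wrapper script can turn these exit codes into readable messages. This is a minimal Python sketch; the function names are hypothetical, and the code-to-meaning mapping simply follows the text above. The sources pass small negative numbers (such as -2) to the exit call, and only the low 8 bits are reported, which is where 254 and 255 come from.

```python
def as_exit_status(register_value):
    """Only the low 8 bits of the exit value reach the parent process,
    so -2 is reported as 254 and -1 as 255."""
    return register_value & 0xFF


def describe_exit(status):
    """Map a burn* exit status to the meaning given in the text above."""
    if status == 254:
        return "integer/memory error"
    if status == 255:
        return "FP/MMX error"
    return "unrecognised status %d" % status
```

For example, `describe_exit(as_exit_status(-2))` yields the integer/memory message, matching what `echo $?` would show as 254.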
|
||||
|
||||
burnBX and burnMMX are essentially very intense RAM testers. They can
|
||||
also take an optional parameter indicating the RAM size to be tested:
|
||||
|
||||
A = 2 kB E = 32 kB I = 512 kB M = 8 MB
|
||||
B = 4 F = 64 J = 1 MB N = 16
|
||||
C = 8 G = 128 K = 2 O = 32
|
||||
D = 16 H = 256 L = 4 P = 64
|
||||
|
||||
`burnBX L` (4 MB) and `burnMMX F` (64 kB) are the default sizes.
|
||||
A-E mostly test L1 cache, F-H test L2 cache, and H-P force their way
|
||||
to RAM. But even A-E will have some cacheline writeouts to RAM.
|
||||
|
||||
In spite of its name, burnBX can be run on any chipset [RAM controller]
|
||||
and tests a lot more than the RAM controller. Unfortunately, burnBX is
|
||||
not optimal on AMD processors. burnMMX is preferable for any CPU that
|
||||
has an MMX unit.
|
||||
|
||||
burnBX/MMX needs about 72 MB of total RAM + swap to start (not necessarily
|
||||
free), but doesn't use this much unless you request it. They will
|
||||
throw a `Sig 11` if you don't have enough swap. If you don't want to
|
||||
add more, you can adjust the .bss section downward as indicated in the
|
||||
source comments. I use very simple memory management. They can also
|
||||
test swap, and at least on my system, I can run 2*`burnBX 8` with 128
|
||||
MB SDRAM with some use of swap, but no excessive thrashing[seeks]. YMMV.
|
||||
|
||||
If sub-spec, your system may lock up after 2-10 minutes. It shouldn't.
|
||||
burn* are just unprivileged user processes. But it probably means
|
||||
your CPU is undercooled, most likely no thermal grease or other interface
|
||||
material between CPU & heatsink. Or some other deficiency. A power
|
||||
cycle should reset the system. But you should fix it.
|
||||
|
||||
Robert Redelmeier
|
||||
redelm@ev1.net
|
||||
|
||||
*** WARNING *** This program is designed to heavily load CPU chips.
|
||||
Undercooled, overclocked or otherwise weak systems may fail causing data
|
||||
loss (filesystem corruption) and possibly permanent damage to electronic
|
||||
components. Nor will it catch all flaws. *** USE AT YOUR OWN RISK ***
|
|
@ -1,124 +0,0 @@
|
|||
# cpuburn-1.4: burnBX Chipset/DRAM Loading Utility
|
||||
# Copyright 2000 Robert J. Redelmeier. All Right Reserved
|
||||
# Licensed under GNU General Public Licence 2.0. No warrantee.
|
||||
# *** USE AT YOUR OWN RISK ***
|
||||
|
||||
.text
|
||||
#ifdef WINDOWS
|
||||
.globl _main
|
||||
_main:
|
||||
movl 4(%esp),%eax
|
||||
movl $12, %ecx # default L = 4 MB
|
||||
subl $1,%eax # 1 string -> no parameter
|
||||
jz no_size
|
||||
|
||||
movl 8(%esp),%eax # address of strings
|
||||
movl 4(%eax),%eax # address of first parameter
|
||||
movzb (%eax),%ecx # first parameter - a byte
|
||||
no_size:
|
||||
subl $12, %esp # stack allocation
|
||||
#else
|
||||
.globl _start
|
||||
_start:
|
||||
subl $12, %esp #stack space
|
||||
movl 20(%esp), %eax
|
||||
movl $12, %ecx # default L = 4 MB
|
||||
testl %eax, %eax # is a param given?
|
||||
jz no_size
|
||||
movl (%eax), %ecx
|
||||
no_size:
|
||||
#endif
|
||||
decl %ecx
|
||||
andl $15, %ecx
|
||||
movl $256, %eax
|
||||
shll %cl, %eax
|
||||
movl %eax, 4(%esp) # save blocksize
|
||||
movl $256*1024, %eax
|
||||
shrl %cl, %eax
|
||||
movl %eax, 8(%esp) # save count blks / 512 MB
|
||||
|
||||
movl 4(%esp), %ecx
|
||||
shrl $4, %ecx
|
||||
movl $buffer, %edi
|
||||
xorl %eax, %eax
|
||||
notl %eax
|
||||
more: # init fill of 2 cachelines
|
||||
movl %eax, %edx # qwords F-F-0-F , F-0-F-0
|
||||
notl %edx
|
||||
movl %eax, 0(%edi)
|
||||
movl %eax, 4(%edi)
|
||||
movl %eax, 8(%edi)
|
||||
movl %eax, 12(%edi)
|
||||
movl %edx, 16(%edi)
|
||||
movl %edx, 20(%edi)
|
||||
movl %eax, 24(%edi)
|
||||
movl %eax, 28(%edi)
|
||||
|
||||
movl %eax, 32(%edi)
|
||||
movl %eax, 36(%edi)
|
||||
movl %edx, 40(%edi)
|
||||
movl %edx, 44(%edi)
|
||||
movl %eax, 48(%edi)
|
||||
movl %eax, 52(%edi)
|
||||
movl %edx, 56(%edi)
|
||||
movl %edx, 60(%edi)
|
||||
rcll $1, %eax # walking zero, 33 cycle
|
||||
leal 64(%edi), %edi # odd inst to preserve CF
|
||||
decl %ecx
|
||||
jnz more
|
||||
|
||||
cld
|
||||
thrash: # MAIN LOOP
|
||||
movl 8(%esp), %edx
|
||||
mov_again:
|
||||
movl $buffer, %esi
|
||||
movl $buf2, %edi
|
||||
movl 4(%esp), %ecx
|
||||
rep # move block up
|
||||
movsl
|
||||
|
||||
movl $buffer + 32, %edi
|
||||
movl $buf2, %esi
|
||||
movl 4(%esp), %ecx
|
||||
subl $8, %ecx
|
||||
rep # move block back shifting
|
||||
movsl # by 1 cacheline
|
||||
|
||||
movl $buffer, %edi
|
||||
movl $8, %ecx
|
||||
rep # replace last c line
|
||||
movsl
|
||||
|
||||
decl %edx # do again for 512 MB.
|
||||
jnz mov_again
|
||||
|
||||
movl $buffer, %edi # DATA CHECK
|
||||
xorl %ecx, %ecx
|
||||
.align 16, 0x90
|
||||
test:
|
||||
mov 0(%edi,%ecx,4), %eax
|
||||
cmp %eax, 4(%edi,%ecx,4)
|
||||
jnz error
|
||||
incl %ecx
|
||||
incl %ecx
|
||||
cmpl 4(%esp), %ecx
|
||||
jc test
|
||||
jmp thrash
|
||||
|
||||
error: # error abend
|
||||
movl $1, %eax
|
||||
#ifdef WINDOWS
|
||||
addl $12, %esp # deallocate stack
|
||||
ret
|
||||
#else
|
||||
movl $-2, %ebx
|
||||
pushl %ebx # *BSD syscall convention
|
||||
pushl %eax
|
||||
int $0x80
|
||||
#endif
|
||||
.bss # Data allocation
|
||||
.align 32
|
||||
.lcomm buffer, 32 <<20 # reduce both to 8 <<20 for only
|
||||
.lcomm buf2, 32 <<20 # 16 MB virtual memory available
|
||||
|
||||
#
|
|
@ -1,76 +0,0 @@
|
|||
# cpuburn-1.4: burnK6 CPU Loading Utility
|
||||
# Copyright 1999 Robert J. Redelmeier. All Right Reserved
|
||||
# Licensed under GNU General Public Licence 2.0. No warrantee.
|
||||
# *** USE AT YOUR OWN RISK ***
|
||||
|
||||
.text
|
||||
#ifdef WINDOWS
|
||||
.globl _main
|
||||
_main:
|
||||
#else
|
||||
.globl _start
|
||||
_start:
|
||||
#endif
|
||||
finit
|
||||
pushl %ebp
|
||||
movl %esp, %ebp
|
||||
andl $-32, %ebp
|
||||
subl $96, %esp
|
||||
fldpi
|
||||
fldl rt
|
||||
fstpl -24(%ebp)
|
||||
fldl e
|
||||
fstpl -32(%ebp)
|
||||
movl half, %edx
|
||||
movl %edx, -8(%ebp)
|
||||
after_check:
|
||||
xorl %eax, %eax
|
||||
movl %eax, %ebx
|
||||
lea -1(%eax), %esi
|
||||
movl $400000000, %ecx
|
||||
movl %ecx, -4(%ebp)
|
||||
.align 32, 0x90
|
||||
crunch:
|
||||
fldl 8-24(%ebp,%esi,8) # CALC BLOCK
|
||||
fmull 8-32(%ebp,%esi,8)
|
||||
addl half+9(%esi,%esi,8), %edx
|
||||
jnz . + 2
|
||||
faddp
|
||||
fldl 8-24(%ebp,%esi,8)
|
||||
decl %ebx
|
||||
subl half+9(%esi,%esi,8), %edx
|
||||
jmp . + 2
|
||||
fmull 8-32(%ebp,%esi,8)
|
||||
incl %ebx
|
||||
decl 8-4(%ebp,%esi,8)
|
||||
fsubp
|
||||
jnz crunch # time for testing ?
|
||||
|
||||
test %ebx, %ebx # TEST BLOCK
|
||||
jnz int_exit
|
||||
cmpl half, %edx
|
||||
jnz int_exit
|
||||
fldpi
|
||||
fcomp %st(1)
|
||||
fstsw %ax
|
||||
sahf
|
||||
jz after_check
|
||||
decl %ebx
|
||||
int_exit:
|
||||
decl %ebx
|
||||
addl $96, %esp
|
||||
popl %ebp
|
||||
movl $1, %eax
|
||||
#ifdef WINDOWS
|
||||
ret
|
||||
#else
|
||||
push %ebx
|
||||
push %eax
|
||||
int $0x80
|
||||
#endif
|
||||
.align 32,0
|
||||
half: .long 0x7fffffff,0
|
||||
e: .long 0xffffffff,0x3fdfffff
|
||||
rt: .long 0xffffffff,0x3fefffff
|
||||
|
||||
|
|
@ -1,83 +0,0 @@
|
|||
# cpuburn-1.4: burnK7 CPU Loading Utility
|
||||
# Copyright 2000 Robert J. Redelmeier. All Right Reserved
|
||||
# Licensed under GNU General Public Licence 2.0. No warrantee.
|
||||
# *** USE AT YOUR OWN RISK ***
|
||||
|
||||
.text
|
||||
#ifdef WINDOWS
|
||||
.globl _main
|
||||
_main:
|
||||
#else
|
||||
.globl _start
|
||||
_start:
|
||||
#endif
|
||||
finit
|
||||
pushl %ebp
|
||||
movl %esp, %ebp
|
||||
andl $-32, %ebp
|
||||
subl $96, %esp
|
||||
fldl rt
|
||||
fstpl -24(%ebp)
|
||||
fldl e
|
||||
fstpl -32(%ebp)
|
||||
fldpi
|
||||
fldpi
|
||||
xorl %eax, %eax
|
||||
xorl %ebx, %ebx
|
||||
xorl %ecx, %ecx
|
||||
movl half, %edx
|
||||
lea -1(%eax), %esi
|
||||
movl %eax, -12(%ebp)
|
||||
movl %edx, -8(%ebp)
|
||||
after_check:
|
||||
movl $850000000, -4(%ebp)
|
||||
.align 32, 0x90
|
||||
crunch:
|
||||
fxch # CALC BLOCK
|
||||
fldl 8-24(%ebp,%esi,8) # 17 instr / 6.0 cycles
|
||||
addl half+9(%esi,%esi,8), %edx
|
||||
fmull 8-32(%ebp,%esi,8)
|
||||
faddp
|
||||
decl %ecx
|
||||
fldl 8-24(%ebp,%esi,8)
|
||||
decl %ebx
|
||||
incl 8-12(%ebp,%esi,8)
|
||||
subl half+9(%esi,%esi,8), %edx
|
||||
incl %ecx
|
||||
fmull 8-32(%ebp,%esi,8)
|
||||
incl %ebx
|
||||
decl 8-4(%ebp,%esi,8)
|
||||
jmp . + 2
|
||||
fsubp %st, %st(2)
|
||||
jnz crunch # time for testing ?
|
||||
|
||||
test %ebx, %ebx # TEST BLOCK
|
||||
jnz int_exit
|
||||
test %ecx, %ecx
|
||||
jnz int_exit
|
||||
cmpl half, %edx
|
||||
jnz int_exit
|
||||
fcom %st(1)
|
||||
fstsw %ax
|
||||
sahf
|
||||
jz after_check
|
||||
decl %ebx
|
||||
int_exit:
|
||||
decl %ebx
|
||||
addl $96, %esp
|
||||
popl %ebp
|
||||
movl $1, %eax
|
||||
#ifdef WINDOWS
|
||||
ret
|
||||
#else
|
||||
push %ebx
|
||||
push %eax
|
||||
int $0x80
|
||||
#endif
|
||||
.align 32,0
|
||||
.fill 64
|
||||
half: .long 0x7fffffff,0
|
||||
e: .long 0xffffffff,0x3fdfffff
|
||||
rt: .long 0xffffffff,0x3fefffff
|
||||
|
||||
|
|
@ -1,161 +0,0 @@
|
|||
# cpuburn-1.4: burnMMX Chipset/DRAM Loading Utility
|
||||
# Copyright 2000 Robert J. Redelmeier. All Right Reserved
|
||||
# Licensed under GNU General Public Licence 2.0. No warrantee.
|
||||
# *** USE AT YOUR OWN RISK ***
|
||||
|
||||
.text
|
||||
#ifdef WINDOWS
|
||||
.globl _main
|
||||
_main:
|
||||
movl 4(%esp),%eax
|
||||
movl $6, %ecx # default f = 64 kB
|
||||
subl $1, %eax # is a param given?
|
||||
jz no_size
|
||||
|
||||
movl 8(%esp),%eax # address of strings
|
||||
movl 4(%eax),%eax # address of first parameter
|
||||
movzb (%eax),%ecx # first parameter - a byte
|
||||
no_size:
|
||||
subl $12, %esp # stack space
|
||||
#else
|
||||
.globl _start
|
||||
_start:
|
||||
subl $12, %esp
|
||||
movl 20(%esp), %eax
|
||||
movl $6, %ecx # default f = 64 kB
|
||||
testl %eax, %eax # is a param given?
|
||||
jz no_size
|
||||
movl (%eax), %ecx
|
||||
no_size:
|
||||
#endif
|
||||
emms
|
||||
movq rt, %mm0
|
||||
decl %ecx
|
||||
andl $15, %ecx # mask off ASCII bits
|
||||
movl $256, %eax
|
||||
shll %cl, %eax
|
||||
movl %eax, 4(%esp) # save blocksize
|
||||
movl $256*1024, %eax
|
||||
shrl %cl, %eax
|
||||
movl %eax, 8(%esp) # save count blks / 512 MB
|
||||
|
||||
movl 4(%esp), %ecx # initial fill of 2 cachelines
|
||||
shrl $4, %ecx
|
||||
movl $buffer, %edi
|
||||
xorl %eax, %eax
|
||||
notl %eax
|
||||
more:
|
||||
movl %eax, %edx # qwords F-F-0-F , F-0-F-0
|
||||
notl %edx
|
||||
movl %eax, 0(%edi)
|
||||
movl %eax, 4(%edi)
|
||||
movl %eax, 8(%edi)
|
||||
movl %eax, 12(%edi)
|
||||
movl %edx, 16(%edi)
|
||||
movl %edx, 20(%edi)
|
||||
movl %eax, 24(%edi)
|
||||
movl %eax, 28(%edi)
|
||||
|
||||
movl %eax, 32(%edi)
|
||||
movl %eax, 36(%edi)
|
||||
movl %edx, 40(%edi)
|
||||
movl %edx, 44(%edi)
|
||||
movl %eax, 48(%edi)
|
||||
movl %eax, 52(%edi)
|
||||
movl %edx, 56(%edi)
|
||||
movl %edx, 60(%edi)
|
||||
rcll $1, %eax # walking zero, 33 cycle
|
||||
leal 64(%edi), %edi # odd inst to preserve CF
|
||||
decl %ecx
|
||||
jnz more
|
||||
|
||||
thrash: # OUTER LOOP
|
||||
movl 8(%esp), %edx # reset count for 512 MB
|
||||
mov_again:
|
||||
movq %mm0, %mm1
|
||||
movq %mm0, %mm2
|
||||
movl $buffer, %esi
|
||||
movl $buf2, %edi
|
||||
movl 4(%esp), %ecx
|
||||
shll $2, %ecx # move block up
|
||||
addl %ecx, %esi
|
||||
addl %ecx, %edi
|
||||
negl %ecx
|
||||
.align 16, 0x90
|
||||
0: # WORKLOOP 7 uops/ 3 clks in L1
|
||||
movq 0(%esi,%ecx),%mm7
|
||||
pmaddwd %mm0, %mm1
|
||||
pmaddwd %mm0, %mm2
|
||||
movq %mm7, 0(%edi,%ecx)
|
||||
addl $8, %ecx
|
||||
jnz 0b
|
||||
|
||||
movl $buffer + 32, %edi # move block back
|
||||
movl $buf2, %esi # shifting by
|
||||
movl 4(%esp), %ecx # one cacheline
|
||||
subl $8, %ecx
|
||||
shll $2, %ecx
|
||||
addl %ecx, %esi
|
||||
addl %ecx, %edi
|
||||
negl %ecx
|
||||
.align 16, 0x90
|
||||
0: # second workloop
|
||||
movq 0(%esi,%ecx),%mm7
|
||||
pmaddwd %mm0, %mm1
|
||||
pmaddwd %mm0, %mm2
|
||||
movq %mm7, 0(%edi,%ecx)
|
||||
addl $8, %ecx
|
||||
jnz 0b
|
||||
|
||||
movl $buffer, %edi
|
||||
movsl # replace last c line
|
||||
movsl
|
||||
movsl
|
||||
movsl
|
||||
movsl
|
||||
movsl
|
||||
movsl
|
||||
movsl
|
||||
decl %edx # do again for 512 MB.
|
||||
jnz mov_again
|
||||
|
||||
xorl %ebx ,%ebx # DATA CHECK
|
||||
decl %ebx
|
||||
pcmpeqd %mm2, %mm1
|
||||
psrlq $16, %mm1
|
||||
movd %mm1, %eax
|
||||
incl %eax
|
||||
jnz error # MMX calcs OK?
|
||||
|
||||
decl %ebx
|
||||
subl $32, %edi
|
||||
xorl %ecx, %ecx
|
||||
test: # Check data (NOT optimized)
|
||||
mov 0(%edi,%ecx,4), %eax
|
||||
cmp %eax, 4(%edi,%ecx,4)
|
||||
jnz error
|
||||
incl %ecx
|
||||
incl %ecx
|
||||
cmpl 4(%esp), %ecx
|
||||
jc test
|
||||
jmp thrash
|
||||
|
||||
error: # error abend
|
||||
emms
|
||||
movl $1, %eax
|
||||
#ifdef WINDOWS
|
||||
addl $12, %esp # deallocate stack
|
||||
ret
|
||||
#else
|
||||
push %ebx
|
||||
push %eax
|
||||
int $0x80
|
||||
#endif
|
||||
rt: .long 0x7fffffff, 0x7fffffff
|
||||
|
||||
.bss # Data allocation
|
||||
.align 32
|
||||
.lcomm buffer, 32 <<20 # reduce both to 8 <<20 for only
|
||||
.lcomm buf2, 32 <<20 # 16 MB virtual memory available
|
||||
|
||||
#
|
|
@ -1,83 +0,0 @@
|
|||
# cpuburn-1.4: burnP5 CPU Loading Utility
|
||||
# Copyright 1999 Robert J. Redelmeier. All Right Reserved
|
||||
# Licensed under GNU General Public Licence 2.0. No warrantee.
|
||||
# *** USE AT YOUR OWN RISK ***
|
||||
|
||||
.text
|
||||
#ifdef WINDOWS
|
||||
.globl _main
|
||||
_main:
|
||||
#else
|
||||
.globl _start
|
||||
_start:
|
||||
#endif
|
||||
finit
|
||||
pushl %ebp
|
||||
movl %esp, %ebp
|
||||
andl $-32, %ebp
|
||||
subl $96, %esp
|
||||
fldl half
|
||||
fstpl -24(%ebp)
|
||||
fldl one
|
||||
fstl -16(%ebp)
|
||||
fld %st
|
||||
fld %st
|
||||
after_check:
|
||||
xorl %eax, %eax
|
||||
movl %eax, %ebx
|
||||
movl $200000000, %ecx
|
||||
.align 32, 0x90
|
||||
# MAIN LOOP 16 flops / 18 cycles
|
||||
crunch:
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
fmull -24(%ebp)
|
||||
fxch %st(1)
|
||||
faddl -16(%ebp)
|
||||
fxch %st(2)
|
||||
|
||||
decl %ecx
|
||||
jnz crunch
|
||||
|
||||
jmp after_check
|
||||
addl $96, %esp # never reached
|
||||
popl %ebp # no checking done
|
||||
movl $1, %eax
|
||||
#ifdef WINDOWS
|
||||
ret
|
||||
#else
|
||||
int $0x80
|
||||
#endif
|
||||
.align 32,0
|
||||
half: .long 0xffffffff,0x3fdfffff
|
||||
one: .long 0xffffffff,0x3fefffff
|
||||
|
|
@ -1,77 +0,0 @@
|
|||
# cpuburn-1.4: burnP6 CPU Loading Utility
|
||||
# Copyright 1999 Robert J. Redelmeier. All Right Reserved
|
||||
# Licensed under GNU General Public Licence 2.0. No warrantee.
|
||||
# *** USE AT YOUR OWN RISK ***
|
||||
|
||||
.text
|
||||
#ifdef WINDOWS
|
||||
.globl _main
|
||||
_main:
|
||||
#else
|
||||
.globl _start
|
||||
_start:
|
||||
#endif
|
||||
finit
|
||||
pushl %ebp
|
||||
movl %esp, %ebp
|
||||
andl $-32, %ebp
|
||||
subl $96, %esp
|
||||
fldpi
|
||||
fldl rt
|
||||
fstpl -24(%ebp)
|
||||
fldl e
|
||||
fstpl -32(%ebp)
|
||||
movl half, %edx
|
||||
movl %edx, -8(%ebp)
|
||||
after_check:
|
||||
xorl %eax, %eax
|
||||
movl %eax, %ebx
|
||||
lea -1(%eax), %esi
|
||||
movl $539000000, %ecx # check after this count
|
||||
movl %ecx, -4(%ebp)
|
||||
.align 32, 0x90
|
||||
crunch: # MAIN LOOP 21uops / 8.0 clocks
|
||||
fldl 8-24(%ebp,%esi,8)
|
||||
fmull 8-32(%ebp,%esi,8)
|
||||
addl half, %edx
|
||||
jnz . + 2
|
||||
faddp
|
||||
fldl -24(%ebp)
|
||||
decl %ebx
|
||||
subl half+9(%esi,%esi,8), %edx
|
||||
jmp . + 2
|
||||
fmull 8-32(%ebp,%esi,8)
|
||||
incl %ebx
|
||||
decl 8-4(%ebp,%esi,8)
|
||||
fsubp
|
||||
jnz crunch
|
||||
|
||||
test %ebx, %ebx # Testing block
|
||||
mov $0, %ebx
|
||||
jnz int_exit
|
||||
cmpl half, %edx
|
||||
jnz int_exit
|
||||
fldpi
|
||||
fcomp %st(1)
|
||||
fstsw %ax
|
||||
sahf
|
||||
jz after_check # fp result = pi ?
|
||||
decl %ebx
|
||||
int_exit: # error abort
|
||||
decl %ebx
|
||||
addl $96, %esp
|
||||
popl %ebp
|
||||
movl $1, %eax # Linux syscall
|
||||
#ifdef WINDOWS
|
||||
ret
|
||||
#else
|
||||
push %ebx
|
||||
push %eax # *BSD syscall
|
||||
int $0x80
|
||||
#endif
|
||||
.align 32,0
|
||||
half: .long 0x7fffffff,0
|
||||
e: .long 0xffffffff,0x3fdfffff
|
||||
rt: .long 0xffffffff,0x3fefffff
|
||||
#
|
||||
|