11d4bb1bd0
Currently mx53 (CortexA8) running at 1GHz reports: Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760) Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop) The original object code looks like this: 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr #18 20: e1a00720 lsr r0, r0, #14 24: e0822b21 add r2, r2, r1, lsr #22 28: e1a02522 lsr r2, r2, #10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr #26 34: e1b00320 lsrs r0, r0, #6 38: 01a0f00e moveq pc, lr 0000003c <__loop_delay>: 3c: e2500001 subs r0, r0, #1 40: 8afffffe bhi 3c <__loop_delay> 44: e1a0f00e mov pc, lr After adding the 'align 3' directive to __loop_delay (align to 8 bytes): 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr #18 20: e1a00720 lsr r0, r0, #14 24: e0822b21 add r2, r2, r1, lsr #22 28: e1a02522 lsr r2, r2, #10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr #26 34: e1b00320 lsrs r0, r0, #6 38: 01a0f00e moveq pc, lr 3c: e320f000 nop {0} 00000040 <__loop_delay>: 40: e2500001 subs r0, r0, #1 44: 8afffffe bhi 40 <__loop_delay> 48: e1a0f00e mov pc, lr 4c: e320f000 nop {0} , which now reports: Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736) Some more test results: On mx31 (ARM1136) running at 532 MHz, before the patch: Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184) On mx31 (ARM1136) running at 532 MHz after the patch: Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968) Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same BogoMIPS value before and after this patch. Reported-by: Tom Evans <tom_usenet@optusnet.com.au> Suggested-by: Tom Evans <tom_usenet@optusnet.com.au> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> |
||
---|---|---|
.. | ||
boot | ||
common | ||
configs | ||
crypto | ||
include | ||
kernel | ||
kvm | ||
lib | ||
mach-at91 | ||
mach-bcm | ||
mach-bcm2835 | ||
mach-clps711x | ||
mach-cns3xxx | ||
mach-davinci | ||
mach-dove | ||
mach-ebsa110 | ||
mach-ep93xx | ||
mach-exynos | ||
mach-footbridge | ||
mach-gemini | ||
mach-highbank | ||
mach-imx | ||
mach-integrator | ||
mach-iop13xx | ||
mach-iop32x | ||
mach-iop33x | ||
mach-ixp4xx | ||
mach-keystone | ||
mach-kirkwood | ||
mach-ks8695 | ||
mach-lpc32xx | ||
mach-mmp | ||
mach-msm | ||
mach-mv78xx0 | ||
mach-mvebu | ||
mach-mxs | ||
mach-netx | ||
mach-nomadik | ||
mach-nspire | ||
mach-omap1 | ||
mach-omap2 | ||
mach-orion5x | ||
mach-picoxcell | ||
mach-prima2 | ||
mach-pxa | ||
mach-realview | ||
mach-rockchip | ||
mach-rpc | ||
mach-s3c24xx | ||
mach-s3c64xx | ||
mach-s5p64x0 | ||
mach-s5pc100 | ||
mach-s5pv210 | ||
mach-sa1100 | ||
mach-shmobile | ||
mach-socfpga | ||
mach-spear | ||
mach-sti | ||
mach-sunxi | ||
mach-tegra | ||
mach-u300 | ||
mach-ux500 | ||
mach-versatile | ||
mach-vexpress | ||
mach-virt | ||
mach-vt8500 | ||
mach-w90x900 | ||
mach-zynq | ||
mm | ||
net | ||
nwfpe | ||
oprofile | ||
plat-iop | ||
plat-omap | ||
plat-orion | ||
plat-pxa | ||
plat-samsung | ||
plat-versatile | ||
tools | ||
vfp | ||
xen | ||
Kconfig | ||
Kconfig-nommu | ||
Kconfig.debug | ||
Makefile |