07c8f163e4
Changes:

* Update lightrec to latest upstream
* Minimize logs when loading a cheevos-compatible content
* Cleanup retro_run()
  - move input query into separate functions
  - move internal fps display to separate function
* Hide other inputs from core options
  - This adds a core option to hide some input options like multitaps, player ports 3-8 and analog-related fine-tuning options.
  - also combine dynarec-only options in one #define directive
* More core option fixes
  - This PR fixes core options and moves them to the related dynarec modes where they are implemented.
    LIGHTREC = relates to platforms that support the new Lightrec mode
    NEW_DYNAREC = relates to the previous dynarec implementation that is still used for some 32-bit devices
  - Dynarec Recompiler core option: both dynarec implementations can be enabled or disabled
* Move guncon options to update_variables
  - This should stop unnecessary RETRO_ENVIRONMENT_GET_VARIABLE callbacks and log spamming
* Fix an edge case where the core can freeze upon loading content
* Automatically disable Lightrec when no BIOS is present, take 2
* cdriso: fix a disk switching deadlock when closing a CD image
* ARM NEON: Fixed bug where MSB of a 15-bit BGR color could corrupt green value.
* cdriso: fix a disk switching deadlock
* unai: Add ARM-optimized lighting / blending functions

Addendum on UNAI ARM-optimized lighting/blending improvements -

"Looking at the generated ASM on 3DS, I thought I could squeeze out some extra performance by moving the inner lighting and blending functions to handwritten A32 assembly. This gives a medium improvement generally (3-5fps faster on the beach in Crash 1) and a large improvement when doing lots of blending (46-48fps before, 57-60fps after, behind the waterfall in Water Dragon Isle in Chrono Cross).

Some other notes:

* I used the ARM11 MPCore (3DS CPU) timings for pipelining.
* I had a few stall cycles during lighting, so I used them to preserve the MSB for lighting and blending, which saved a store, load, and orr later on. ~3-6 cycles saved overall by doing that.
* I switched from u16 to uint_fast16_t, which is 32-bit on this platform. This saved a few useless uxth instructions for another few cycles. This shouldn't affect other platforms, but I don't know for sure. Could typedef if necessary.
* A lot of the speed improvement in blending comes from not using two instructions per and. For example, & 0x8000 -- the compiler preferred to mask out bytes using bic 0x7F00 and bic 0x00FF. Both slower and seemed less correct for what we're trying to do."
||
---|---|---|
.. | ||
DESCR | ||
distinfo | ||
Makefile | ||
PLIST |
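
The notes in the commit message about preserving the MSB, masking with `& 0x8000`, and using `uint_fast16_t` can be illustrated with a small C sketch. This is not the upstream UNAI code or its assembly. It is a minimal example that assumes the usual 15-bit BGR pixel layout (bits 0-4 red, 5-9 green, 10-14 blue, bit 15 as a flag bit) and a simple half-and-half blend. The names `blend_half`, `MSB_MASK`, `COLOR_MASK` and `LOW_BITS` are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed pixel layout: bits 0-4 red, 5-9 green, 10-14 blue, bit 15 flag. */
#define MSB_MASK   0x8000u  /* the flag bit, isolated with a single AND */
#define COLOR_MASK 0x7FFFu  /* the 15 color bits */
#define LOW_BITS   0x0421u  /* lowest bit of each 5-bit channel */

/*
 * Blend background and foreground as bg/2 + fg/2 per channel and carry the
 * foreground MSB through unchanged. uint_fast16_t lets the compiler pick a
 * register-sized type, so no extra 16-bit truncation is needed in between.
 */
static uint_fast16_t blend_half(uint_fast16_t bg, uint_fast16_t fg)
{
    /* Split off the MSB first so it never takes part in the color math. */
    uint_fast16_t msb = fg & MSB_MASK;
    uint_fast16_t b = bg & COLOR_MASK;
    uint_fast16_t f = fg & COLOR_MASK;

    /* Clearing each channel's low bit before shifting keeps one channel's
     * bit from sliding into the top of its neighbour. */
    uint_fast16_t avg = ((b & ~LOW_BITS) >> 1) + ((f & ~LOW_BITS) >> 1);

    /* Each channel is at most 15 + 15 = 30, so bit 15 must still be clear. */
    assert((avg & MSB_MASK) == 0);

    return avg | msb;
}
```

For example, blending a white background (`0x7FFF`) with a black foreground whose MSB is set (`0x8000`) yields `0xBDEF`: every channel is halved to 15 and the flag bit survives. If the MSB were left in place during the shift, it would move into the color bits, which is the general kind of corruption that the separate mask step avoids.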