Channel: Raspberry Pi Forums

General • Re: Surprising performance disparity


I then marked them as NOINLINE and looked at the disassembly, but I can't seem to spot what would make one so much slower than the other.
You can see from the disassembly that the 'fast' version has replaced your loops with a call to the optimised memcpy(), which, apart from anything else, will be writing a word at a time rather than a byte at a time, giving up to a 4x speedup from that aspect alone. The other version is still evaluating your loops and copying pixel by pixel.

I can't immediately explain why the compiler makes the optimisation in one case but not the other, but I suspect it cannot rule out that the pointer aliases something else, so it has to keep the stores in their original order. If so, declaring draw_buf as

Code:

uint8_t *restrict draw_buf
instead of

Code:

uint8_t *draw_buf
will probably fix it.
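
As a rough sketch of the difference, assuming a plain byte-copy loop (the function names, src and n below are made up for illustration and are not from the original code):

Code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the original loop. Without restrict the
   compiler must allow for draw_buf and src overlapping, so it may keep
   the byte-by-byte loop or add a run-time overlap check. */
void copy_pixels_plain(uint8_t *draw_buf, const uint8_t *src, size_t n)
{
    assert(draw_buf != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        draw_buf[i] = src[i];
}

/* Same loop, but restrict promises that, for the duration of the call,
   the bytes written through draw_buf are not accessed through src.
   That leaves the optimiser free to substitute a block copy such as
   memcpy(). Passing overlapping buffers here is undefined behaviour. */
void copy_pixels_restrict(uint8_t *restrict draw_buf,
                          const uint8_t *restrict src, size_t n)
{
    assert(draw_buf != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        draw_buf[i] = src[i];
}
```

Both functions produce the same result for non-overlapping buffers; only the generated code may differ. Whether the compiler actually emits the memcpy() call depends on the compiler version and optimisation level, so it is worth checking the disassembly again after making the change.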

Statistics: Posted by arg001 — Sun Jun 30, 2024 9:18 am


