[Icc-mot] Interesting Optimizations
Edward Karpicz
ekarpicz at freemail.lt
Fri Apr 4 08:42:07 PST 2008
Hi Gene
> ; unsigned crc_calculation(unsigned irrelevant, unsigned useless)
> 1662 ; {
> 1662 .dbline 1231
> 1662 ; unsigned i, j, k;
> 1662 ;
>
> 1662 ; i = j = 0x0011;
> 1662 1800820011 movw #17,2,S
> 1667 1800840011 movw #17,4,S
> 166C .dbline 1232
>
> 166C ; i = i << 8;
> 166C EC84 ldd 4,S // D = 0x0011
All right, except that it's not optimal. If the compiler had initialized i
and j with load/store instructions instead of MOVW, the value would already
be in D and it could skip loading i into a register here.
> 166E B710 tfr B,A // D = 0x1111
> 1670 C7 clrb // D = 0x1100
> 1671 B746 tfr D,Y // Worked!
> 1673 6D84 sty 4,S // Back in original location
> 1675 .dbline 1233
All right, except that it isn't necessary to move D to Y first and then
store Y to i; a single std 4,S would do. But the generated code is correct.
>
> 1675 ; i = (i << 8) + 1;
> 1675 A684 ldaa 4,S // D = 0x1100 (from above)
> 1677 C601 ldab #1 // D = 0x1101 (?!)
> 1679 B746 tfr D,Y // Fast? (but wrong)
> 167B 6D84 sty 4,S // Back to original spot
> 167D .dbline 1234
Why wrong? It's OK. As seen in the i = i << 8 case, i is a stack local at
4,S. Since A is the upper half of the D register, we can optimize regD << 8
by loading A directly from i instead of shifting, right? And since a left
shift zeroes the low bits, we can replace the +1 with loading B with #1. Try
adding something bigger than 255 and the compiler will generate different
code.
>
> 167D ; j = (j << 8) | 2;
> 167D A682 ldaa 2,S // D = 0x0001, ldaa 3,s?
It's loading the high byte of D (register A), that's why it uses 2,S. It
optimizes the << 8 by loading the right bits straight into the right 8-bit
register.
> 167F C602 ldab #2 // D = 0x0002, oops
> 1681 B746 tfr D,Y // Faster, but curiously,
> 1683 6C82 std 2,S // drastically incorrect
> 1685 .dbline -2
>
Why incorrect?
Regards
Edward