[Icc-mot] Interesting Optimizations

Gene Norris genenorris at spotengineering.com
Fri Apr 4 09:20:28 PST 2008


Hello Edward,

Edward Karpicz wrote:
> Hi Gene
> 
>> ; unsigned crc_calculation(unsigned irrelevant, unsigned useless)
>>  1662           ; {
>>  1662                   .dbline 1231
>>  1662           ;   unsigned i, j, k;
>>  1662           ;
>>
>>  1662           ;   i = j = 0x0011;
>>  1662 1800820011        movw #17,2,S
>>  1667 1800840011        movw #17,4,S
>>  166C                   .dbline 1232
>>
>>  166C           ;   i = i << 8;
>>  166C EC84              ldd 4,S             // D = 0x0011
> 
> All right, except it's not optimal. If the compiler initialized i and j
> with load-store instructions instead of MOVW, it could skip reloading i
> into a register.
> 
>>  166E B710              tfr B,A             // D = 0x1111
>>  1670 C7                clrb                // D = 0x1100
>>  1671 B746              tfr D,Y             // Worked!
>>  1673 6D84              sty 4,S             // Back in original location
>>  1675                   .dbline 1233
> 
> All right, except that it isn't necessary to move D to Y first and then
> store Y to i. But the generated code is correct.
Yes, I didn't want to muddy the waters though.

> 
> 
>>
>>  1675           ;   i = (i << 8) + 1;
>>  1675 A684              ldaa 4,S            // D = 0x1100 (from above)
>>  1677 C601              ldab #1             // D = 0x1101 (?!)
>>  1679 B746              tfr D,Y             // Fast? (but wrong)
>>  167B 6D84              sty 4,S             // Back to original spot
>>  167D                   .dbline 1234
> 
> Why wrong? It's OK. As seen in the i = i << 8 case, i is a stack local
> at 4,S. Since A is the upper half of the D register, we can optimize
> regD<<8 by loading A with highbyte(i), right? Since the left shift zeroes
> the low bits, we can replace +1 with loading B with #1. Try adding
> something bigger than 255 and the compiler will generate different code.
It is loading the high byte of D with the high byte of i (4,S); it should
be loading the high byte of D with the low byte of i (5,S), since that is
what i << 8 moves there. HC12s are big-endian, I'm not sure where you are
confused.

i = 0x1100
i = (i << 8) + 1;

Result is 0x1101; it should be 0x0001, since 0x1100 << 8 truncates to
0x0000 in 16 bits.
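
For anyone who wants to check the expected value on a desktop compiler,
here is a minimal sketch. It assumes uint16_t as a stand-in for the HC12's
16-bit unsigned; the names shl8_add and check_i are made up for
illustration and are not part of the original source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: (v << 8) + addend in 16-bit unsigned arithmetic.
   uint16_t is assumed to match the width of "unsigned" on the HC12. */
uint16_t shl8_add(uint16_t v, uint16_t addend)
{
    /* Widen before shifting, then truncate back to 16 bits. */
    uint16_t shifted = (uint16_t)((uint32_t)v << 8);
    return (uint16_t)(shifted + addend);
}

/* Replays the two statements from the listing for i. */
void check_i(void)
{
    uint16_t i = 0x0011;

    i = (uint16_t)((uint32_t)i << 8);   /* i = i << 8;       */
    assert(i == 0x1100);

    i = shl8_add(i, 1);                 /* i = (i << 8) + 1; */
    assert(i == 0x0001);                /* not 0x1101        */
}
```

Calling check_i() from main passes silently if the 16-bit arithmetic is as
described.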

> 
> 
>>
>>  167D           ;   j = (j << 8) | 2;
>>  167D A682              ldaa 2,S            // D = 0x0001, ldaa 3,s?
> 
> It's loading the high byte of D, that's why 2,S. It optimizes <<8 by
> loading the right bits into the right 8-bit register.
It is loading the high byte of D with the high byte of j (2,S); it should
be loading the high byte of D with the low byte of j (3,S), since that is
what j << 8 moves there. It optimizes incorrectly. No 8-bit shift occurs
in either example. Only the low byte is correct.

> 
>>  167F C602              ldab #2             // D = 0x0002, oops
>>  1681 B746              tfr D,Y             // Faster, but curiously,
>>  1683 6C82              std 2,S             // drastically incorrect
>>  1685                   .dbline -2
>>
> 
> Why incorrect?
j = 0x0011
j = (j << 8) | 2;

Result is 0x0002; it should be 0x1102.
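
The two load choices can also be modelled at the byte level. This is only
a sketch under assumptions: the slot16 type and all function names are
hypothetical, mem[0] stands for the byte at n,S (the high byte on a
big-endian HC12) and mem[1] for the byte at n+1,S. It models the listing,
not the compiler's internal logic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16-bit stack slot, stored big-endian. */
typedef struct { uint8_t mem[2]; } slot16;

slot16 slot_store(uint16_t v)
{
    slot16 s;
    s.mem[0] = (uint8_t)(v >> 8);    /* high byte, at n,S   */
    s.mem[1] = (uint8_t)(v & 0xFF);  /* low byte,  at n+1,S */
    return s;
}

/* What the listing does: ldaa n,S ; ldab #k  (A = high byte). */
uint16_t as_generated(slot16 s, uint8_t k)
{
    return (uint16_t)((s.mem[0] << 8) | k);
}

/* What (v << 8) combined with k needs: ldaa n+1,S ; ldab #k (A = low byte). */
uint16_t as_expected(slot16 s, uint8_t k)
{
    return (uint16_t)((s.mem[1] << 8) | k);
}

/* Replays both cases from the listing. */
void check_both(void)
{
    assert(as_generated(slot_store(0x1100), 1) == 0x1101);  /* i, as compiled */
    assert(as_expected(slot_store(0x1100), 1) == 0x0001);   /* i, correct     */
    assert(as_generated(slot_store(0x0011), 2) == 0x0002);  /* j, as compiled */
    assert(as_expected(slot_store(0x0011), 2) == 0x1102);   /* j, correct     */
}
```

The only difference between the two functions is which byte of the slot
ends up in A, which is the one-byte offset difference between ldaa n,S and
ldaa n+1,S.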
> 
> Regards
> Edward
> 