[Icc-avr] Optimization of write and read to register STS, LDS
Michael Dipperstein
MDipperstein at CalAmp.com
Mon Oct 1 17:14:30 PDT 2007
Bengt,
You can get the compiler to use the Z register if you use a local
register pointer + offset instead of the register name directly. You
just need to make sure the offset is never larger than 63.
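
As a rough illustration of the 63-byte limit, here is a host-side sketch in C (not target code) that checks whether a register can be reached from a given base pointer. The addresses are read off the listing further down in this message (base addresses 97 and 176 plus the z+q offsets). The names displacement_ok and check_message_offsets are made up for this sketch, and it assumes, as the listing suggests, that only non-negative offsets are used.

```c
#include <assert.h>

/* STD/LDD encode the displacement q in 6 bits, so 0 <= q <= 63. */
#define MAX_DISPLACEMENT 63u

/* ATmega48 data-space addresses as they appear in the listing in this
   message: bases 97 and 176, the z+q offsets, and "lds R16,102". */
enum {
    ADDR_CLKPR  = 97,
    ADDR_OSCCAL = 102,
    ADDR_TIMSK1 = 97 + 14,
    ADDR_TIMSK2 = 97 + 15,
    ADDR_TCCR1B = 97 + 32,
    ADDR_TCCR2A = 176,
    ADDR_OCR2A  = 176 + 3,
    ADDR_ASSR   = 176 + 6
};

/* Hypothetical helper: returns 1 if reg is reachable as base + q
   with q in 0..63, otherwise 0. */
int displacement_ok(unsigned base, unsigned reg)
{
    return reg >= base && (reg - base) <= MAX_DISPLACEMENT;
}

/* Aborts if any offset used in the message falls outside the window. */
void check_message_offsets(void)
{
    /* First window, based at CLKPR. */
    assert(displacement_ok(ADDR_CLKPR, ADDR_TIMSK1));
    assert(displacement_ok(ADDR_CLKPR, ADDR_TCCR1B));
    assert(displacement_ok(ADDR_CLKPR, ADDR_TIMSK2));
    /* Second window, based at TCCR2A. */
    assert(displacement_ok(ADDR_TCCR2A, ADDR_ASSR));
    assert(displacement_ok(ADDR_TCCR2A, ADDR_OCR2A));
}
```

TCCR2A is 79 bytes above CLKPR, so it cannot share the first window, which is why the pointer gets reloaded partway through the function.
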
Here's what I came up with using ATmega48 and version 7.13A with no
optimization:
void OsccalRetrocal(void)
{
    unsigned char temp;
    unsigned char val_1 = 0x01;
    volatile unsigned char *reg;

    // set the CPU Frequency to 1MHz (8Mhz / 8 = 1Mhz)
    reg = &CLKPR;
    *reg = (1<<CLKPCE);
    *reg = ((1<<CLKPS1)|(1<<CLKPS0));

    // setup timer1 TC1= 1MHz = 1usec
    *(reg + (&TIMSK1 - &CLKPR)) = 0;
    *(reg + (&TCCR1B - &CLKPR)) = val_1;

    // setup timer2
    *(reg + (&TIMSK2 - &CLKPR)) = 0;

    /* reload reg with an address within 64 bytes of remaining registers */
    reg = &TCCR2A;

    // set 32,768kHz osc as source for timer2
    *(reg + (&ASSR - &TCCR2A)) = (1<<AS2);

    // 200 / 32768Hz ~= 6103uS
    *(reg + (&OCR2A - &TCCR2A)) = 200;

    // TC2 = 32768Hz ~= 30.5176uS
    *reg = val_1;

    temp = OSCCAL;
}
Here's what the compiler gave me:
.dbfunc e OsccalRetrocal _OsccalRetrocal fV
; temp -> R16
; val_1 -> R16
; reg -> R18,R19
.even
_OsccalRetrocal::
.dbline -1
.dbline 242
; }
;
; void OsccalRetrocal(void)
; {
.dbline 244
; unsigned char temp;
; unsigned char val_1 = 0x01;
ldi R16,1
.dbline 248
; volatile unsigned char *reg;
;
; // set the CPU Frequency to 1MHz (8Mhz / 8 = 1Mhz)
; reg = &CLKPR;
ldi R18,97
ldi R19,0
.dbline 249
; *reg = (1<<CLKPCE);
ldi R24,128
movw R30,R18
std z+0,R24
.dbline 250
; *reg = ((1<<CLKPS1)|(1<<CLKPS0));
ldi R24,3
std z+0,R24
.dbline 253
;
; // setup timer1 TC1= 1MHz = 1usec
; *(reg + (&TIMSK1 - &CLKPR)) = 0;
clr R2
std z+14,R2
.dbline 254
; *(reg + (&TCCR1B - &CLKPR)) = val_1;
std z+32,R16
.dbline 257
;
; // setup timer2
; *(reg + (&TIMSK2 - &CLKPR)) = 0;
std z+15,R2
.dbline 260
;
; /* reload reg with an address within 64 bytes of remaining registers */
; reg = &TCCR2A;
ldi R18,176
.dbline 263
;
; // set 32,768kHz osc as source for timer2
; *(reg + (&ASSR - &TCCR2A)) = (1<<AS2);
ldi R24,32
movw R30,R18
std z+6,R24
.dbline 266
;
; // 200 / 32768Hz ~= 6103uS
; *(reg + (&OCR2A - &TCCR2A)) = 200;
ldi R24,200
std z+3,R24
.dbline 269
;
; // TC2 = 32768Hz ~= 30.5176uS
; *reg = val_1;
std z+0,R16
.dbline 271
;
; temp = OSCCAL;
lds R16,102
.dbline -2
L51:
.dbline 0 ; func end
ret
You can get a little better in assembly by using the Z register
directly, instead of using R18/R19.
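
To put a rough number on that, here is a small host-side C sketch that counts only the pointer-setup instructions. It assumes LDI and MOVW are one 16-bit word each, and that a hand-written version would load R30/R31 directly and reload just the low byte for the second window. That hand-written sequence is my assumption, not something I assembled and tested. The function names are made up for the sketch.

```c
#include <assert.h>

/* LDI and MOVW each occupy one 16-bit program word on the AVR. */
enum { WORDS_LDI = 1, WORDS_MOVW = 1 };

/* Pointer setup as in the compiler listing:
   ldi R18 / ldi R19 / movw R30,R18 / ldi R18 (reload) / movw R30,R18 */
unsigned setup_words_listing(void)
{
    return 3 * WORDS_LDI + 2 * WORDS_MOVW;
}

/* Assumed hand-written setup that loads Z directly:
   ldi R30 / ldi R31 / ldi R30 (reload low byte only) */
unsigned setup_words_direct_z(void)
{
    return 3 * WORDS_LDI;
}

/* Bytes saved by skipping the R18/R19 copy (2 bytes per word). */
unsigned setup_bytes_saved(void)
{
    unsigned listing = setup_words_listing();
    unsigned direct = setup_words_direct_z();

    assert(listing >= direct);
    return 2u * (listing - direct);
}
```

Under those assumptions the saving is the two MOVW instructions, so the gain is small.
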
-Mike
-----Original Message-----
From: icc-avr-bounces at imagecraft.com
[mailto:icc-avr-bounces at imagecraft.com] On Behalf Of Bengt Ragnemalm
Sent: Monday, October 01, 2007 11:35 AM
To: Discussion list for ICCAVR and ICCtiny Users. You do NOT need
to subscribe to icc-announce if you are a member of this.
Subject: Re: [Icc-avr] Optimization of write and read to register STS,
LDS
Richard, as this is a post about optimization, I think it is mostly for
you.
I am posting the code here so I can describe how this optimization works.
What I am asking is whether this type of optimization is, or can be, done
by the compiler. Then there is always the question of how much effect it
will have.
-----
Assembler snippet:
RETRO_CAL: STS CLKPR,V128 ;DROP CLOCK TO 1Mhz
STS CLKPR,THREE ;
STS TIMSK1,ZERO ;SETUP TIMER1
STS TCCR1B,ONE ;TC1 = 1MHz = 1uS
STS TIMSK2,ZERO ;SETUP TIMER2
STS ASSR,EIGHT ;CLOCK FROM 32KHz OSCILLATOR
LDI TMP1,200 ;OCR2A = 200
STS OCR2A,TMP1 ;200 / 32768Hz ~= 6103uS
STS TCCR2A,ONE ;TC2 = 32768Hz ~= 30.5176uS
LDS TMP,OSCCAL ;READ OSCCAL
----
Same snippet in C:
void OSCCAL_retrocal(void)
{
    unsigned char temp;
    unsigned char val_1 = 0x01;
    unsigned char low,high;

    // set the CPU Frequency to 1MHz (8Mhz / 8 = 1Mhz)
    CLKPR = (1<<CLKPCE);
    CLKPR = (1<<CLKPS1)|(1<<CLKPS0);

    // setup timer1 TC1= 1MHz = 1usec
    TIMSK1 = 0;
    TCCR1B = val_1;

    // setup timer2
    TIMSK2 = 0;

    // set 32,768kHz osc as source for timer2
    ASSR = (1<<AS2);

    // 200 / 32768Hz ~= 6103uS
    OCR2A = 200;

    // TC2 = 32768Hz ~= 30.5176uS
    TCCR2A = val_1;

    temp = OSCCAL;
}
----
You can see that there are several register accesses that are done with
STS or LDS instructions. As there are so many of them, someone suggested
that they be replaced with STD and LDD instructions. To use displacement,
I understand that the Z register (?) must be used. Loading the Z register
will also cost some extra bytes. Maybe the Z register is already in use,
in which case this possibility would only arise in very special
situations, making it not worth the effort.
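
The trade-off between the extra bytes for loading Z and the shorter instructions can be sketched as simple arithmetic. The C below is a host-side sketch, not target code. It assumes STS/LDS take two 16-bit words and STD/LDD take one, which holds for parts like the ATmega48, and it leaves out the instructions that load the constants, since those are the same in both variants. The function names are invented for illustration.

```c
#include <assert.h>

/* Program words per access: STS/LDS carry a 16-bit address in a second
   word, STD/LDD fit the displacement into a single word. */
enum { WORDS_STS = 2, WORDS_STD = 1 };

/* Code words for n register accesses using STS/LDS. */
unsigned words_direct(unsigned n)
{
    return WORDS_STS * n;
}

/* Code words for n accesses using STD/LDD, plus the words spent on
   loading (and possibly reloading) the pointer register. */
unsigned words_displaced(unsigned n, unsigned z_setup_words)
{
    return WORDS_STD * n + z_setup_words;
}

/* Smallest number of accesses at which displacement addressing is
   strictly smaller than direct addressing. */
unsigned break_even_accesses(unsigned z_setup_words)
{
    unsigned n = 1;

    while (words_displaced(n, z_setup_words) >= words_direct(n))
        n++;
    assert(words_displaced(n, z_setup_words) < words_direct(n));
    return n;
}
```

With two LDI instructions to load the pointer, displacement addressing pays off from the third access on. For the eight stores in this routine, even five words of pointer setup would give 13 words against 16.
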
But I like to stretch C and the compiler to their limits to see how far
you can go before you have to switch to inline assembler.
Also note the local variable val_1 that is used to set the value 1
several times.
Regards
Bengt
-------------------
> I am testing a size optimized osccal routine (RetroDan, AVRfreaks). There
> is a hint there about the possibility to optimize many STS and LDS
> instructions with LDD and STD (load and store with displacement).
>
> It occurred to me: why could a C compiler not catch that possibility? Or
> maybe it would be possible to "fool" the compiler to use it anyway.
>
> Regards,
> Bengt
>
> _______________________________________________
> Icc-avr mailing list
> Icc-avr at imagecraft.com
> http://dragonsgate.net/mailman/listinfo/icc-avr
>