Searching \ for '[AVR] divide 48bit number by (constant) 125 ?' in subject line. () Help us get a faster server
FAQ page: massmind.org/techref/method/math.htm?key=divide
Search entire site for: 'divide 48bit number by (constant) 125 ?'.

Exact match. Not showing close matches.
'[AVR] divide 48bit number by (constant) 125 ?'
2009\04\29@031455 by   I want to calculate:

(ticks * <microsecondsPerTick>) / 1000

to return the number of milliseconds elapsed based on a count
maintained by a timer ISR.  microsecondsPerTick is a nice constant
number like 1024 or 2048, so the multiplication becomes a shift, and
factoring out twos I get:

(ticks * (microsecondsPerTick/8)) / 125

"ticks" is at least 40 bits, however, and the multiplication MUST NOT
OVERFLOW (been there, trying to fix that!), so I'm looking at at at
least a 48bit intermediate value.

It would be nice if the code were in C.  Avr gcc, in particular.
There are 8, 16, 32, and even 64 bit math functions available, but of
course they're *C*, so they don't let you actually do mixed-sized
math, nor are they likely to optimized

The code is not particularly time critical, but it would be nice if it
were as small and fast as possible (using 64bit long longs seems
excessive.)

Thoughts?  Any pointers to mixed-size C math functions in general?  In
principle, I think I understand how to do this, but ...  Is there a
way to take advantage of the constant?  Of 125 in particular?  (lots
of one bits there...)

Thanks
Bill W  How much error can you stand?
125 is pretty close to 128.  :-)

-----Original Message-----
{Quote hidden}

>-   >How much error can you stand?
>125 is pretty close to 128.  :-)

Well, from what he is saying, not much error.

But I was wondering - just thinking outside the box here - if it is possible
to change the clock frequency of the micro enough to make the 125 become 128
...  On Wed, 29 Apr 2009, William "Chops" Westfield wrote:

> Thoughts?  Any pointers to mixed-size C math functions in general?  In
> principle, I think I understand how to do this, but ...  Is there a
> way to take advantage of the constant?  Of 125 in particular?  (lots
> of one bits there...)

Are you able to do this with 5 16 bit by 8 bit divisons?

Regards
Sergio Masci   Is the 'microsecondsPerTick' known during the compile time? How long would
it do that with float? (which is 32 bit only so may affects the accuracy).

Tamas

On Wed, Apr 29, 2009 at 8:14 AM, William "Chops" Westfield
<westfw mac.com>wrote:

{Quote hidden}

> -  48 bits by 125

I would use a grade school divide and nybble away at the 48 bit number
something like

unsigned char i48;  // 48 bit number
unsigned char r48;  // Result

unsigned char m,r;     // most , least and result
signed char l;         // signed so some compilers can check better for
negative than > 128

// init the most and least significant numbers
m = i48;
l = i48;

for (char i = 2; i != 5; i++)
{

for (char j = 8; i != 0; i--)  // should be optimizede to a repeat .. until
{
r <<= 1;          // No need to init all the bits will be shifted out
if (m > 250)      // 125 shifted up by 1
{
m -= 250;
r += 1;
}

// shift m:l left
m <<= 1;
if (l < 0) m += 1;
l <<= 1;
}

r48[i-2] = r;
l = i48[i+1];
}

// Run out the last byte

for (char j = 8; i != 0; i--)  // should be optimizede to a repeat .. until
{
r48 <<= 1;          // No need to init all the bits will be shifted out
if (m > 250)      // 125 shifted up by 1
{
m -= 250;
r48 += 1;
}
m <<= 1;
}

This shouldn't generate much code all of the operations are byte sized.
Using 18037 support the shifts and tests should be able to be optimized to
a single 8 bit subtract and three shifts.

The over all result probably is out by a factor of two for the 125 to 250
doubling.

Regards,

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com   William "Chops" Westfield wrote:
> I want to calculate:
>
>    (ticks * <microsecondsPerTick>) / 1000
>
> to return the number of milliseconds elapsed based on a count
> maintained by a timer ISR.  microsecondsPerTick is a nice constant
> number like 1024 or 2048, so the multiplication becomes a shift, and
> factoring out twos I get:
>
>    (ticks * (microsecondsPerTick/8)) / 125
>
> "ticks" is at least 40 bits, however, and the multiplication MUST NOT
> OVERFLOW (been there, trying to fix that!), so I'm looking at at at
> least a 48bit intermediate value.

Do you have access to the ISR? I would be inclined to do the following,
rather than (or in addition to) simply counting the raw ticks:

/* microseconds can be a 16-bit integer */
microseconds += microsecondsPerTick;
while (microseconds >= 1000) {
microseconds -= 1000;
++milliseconds;
}

-- Dave Tweed   This should give you what you need - you can turn a division into a
sum of several binary divisions:

http://piclist.com/techref/method/math/divconst.htm

There is an online code generator here, though it only goes up to 32
bits, but you probably don't care about the code generated, just the
binary division factors:
http://www.piclist.com/techref/piclist/codegen/constdivmul.htm

1/125 = 0.008, which happens to be very close to:

; ALGORITHM:
; Clear accumulator
; Add input / 128 to accumulator
; Add input / 8192 to accumulator
; Add input / 16384 to accumulator
; Add input / 262144 to accumulator
; Move accumulator to result
;
; Approximated constant: 0.00799942, Error: 0.00724792 %

http://www.piclist.com/cgi-bin/constdivmul.exe?Acc=ACC&Bits=32&endian=little&Const=.008&ConstErr=0.05&Temp=TEMP&cpu=pic16

You can change the "ConstErr=0.05" to a finer value to reduce the error further.

On Wed, Apr 29, 2009 at 3:14 AM, William "Chops" Westfield
<westfw mac.com> wrote:
{Quote hidden}

>   Just in case the below wasn't explicit enough, the info below would be
converted into C as follows:

result = input / 125;  // Original
result = input / 128 + input / 8192 + input / 16384 + input / 262144;
// To binary equivalant ( with 0.007% error)
result = input >> 7 + input >> 13 + input >> 14 + input >> 18; //
Using shifts rather than division (faster, same result)

Slightly faster (since a one bit shift takes a full instruction cycle):
temp = input;
temp = temp >> 7;
result += temp;
temp = temp >> 6; // temp = input >> 13
result += temp;
temp = temp >> 1;  // temp = input >> 14
result += temp;
temp = temp >> 4; // temp = input >> 18
result += temp;

This can be optimized further, but it only requires a total of 18
shifts (whereas the previous version might have actually generated 52
shifts) and a similar number of additions.  Across a 48 bit number on
an 8 bit processor this adds up...

If you decide you only need 1% accuracy, then you only need the first
two factors (128, 8192), and if you want 0.001% accuracy you need 6
factors (128, 8192, 16384, 262144, 2097152, 8388608).

On Wed, Apr 29, 2009 at 10:38 AM, M. Adam Davis <stienman gmail.com> wrote:
{Quote hidden}

>> -  > I want to calculate:
>
>    (ticks * <microsecondsPerTick>) / 1000
>
> to return the number of milliseconds elapsed based on a
> count maintained by a timer ISR.  microsecondsPerTick
> is a nice constant number like 1024 or 2048, so the
> multiplication becomes a shift

So, what you need is just to divide by 1000 after the shifting.
Why not convert it to decimal, discard last 3 digits, and then convert
it back to binary?

> The code is not particularly time critical  why not find out where ticks is incremented, and add another variable
that increments every 1K times ticks gets incremented, and use that
instead of ticks?  Or depending on the accuracy you want, change the
formula to msecs = (ticks>>17)*<microsecondsPerTick>*131;

Tony

Marechiare wrote:
{Quote hidden}   On Apr 29, 2009, at 4:19 AM, Dave Tweed wrote:

> Do you have access to the ISR? I would be inclined to do the
> following,
> rather than (or in addition to) simply counting the raw ticks:
>
>    /* microseconds can be a 16-bit integer */
>    microseconds += microsecondsPerTick;
>    while (microseconds >= 1000) {
>        microseconds -= 1000;
>        ++milliseconds;
>    }

This is in fact the currently proposed solution (plus or minus
optimization), although some would rather the ISR were shorter.

(avr-gcc will "optimize" that while loop to a divide, BTW.  ouch! (and
Grr!)
http://www.avrfreaks.net/index.php?
name=PNphpBB2&file=viewtopic&t=78209 )   That's a very nice explanation.  Exactly these kind of things what a high
level language compiler supposed to do, however. It is actually really sad
if avr-gcc does not support microcontroller specific optimizations and
storage classes.

Tamas

On Wed, Apr 29, 2009 at 3:56 PM, M. Adam Davis <stienman gmail.com> wrote:

{Quote hidden}  ----- Original Message -----
From: "Dave Tweed" <pic dtweed.com>
To: <piclist mit.edu>
Sent: Wednesday, April 29, 2009 7:19 AM
Subject: Re: [AVR] divide 48bit number by (constant) 125 ?

{Quote hidden}

> --   On Apr 29, 2009, at 7:56 AM, M. Adam Davis wrote:

> Slightly faster (since a one bit shift takes a full instruction
> cycle):
> temp = input;
> temp = temp >> 7;
> result += temp;
> temp = temp >> 6; // temp = input >> 13
> result += temp;
> temp = temp >> 1;  // temp = input >> 14
> result += temp;
> temp = temp >> 4; // temp = input >> 18
> result += temp;

This is especially interesting, because after all the original "ticks"
input BEFORE the multiply is already "input>>7"  (it becomes a
multiply by "128/125", which is just fine...)

BillW   On Apr 29, 2009, at 7:56 AM, M. Adam Davis wrote:

> If you decide you only need 1% accuracy, then you only need the first
> two factors (128, 8192), and if you want 0.001% accuracy you need 6
> factors (128, 8192, 16384, 262144, 2097152, 8388608).

To get the accuracy you talk about, do you need to have "perfect"
fractions (1/128 * input) or does truncation suffice? (Rounding?)   I
think the last time I actually tried to implement one of these "sum of
power-of-2 fractions" I ended up getting really lousy results, that I
attributed (rightly or wrongly) to the bits shifted off the right
side...

BillW  > -----Original Message-----
> From: piclist-bounces mit.edu [piclist-bounces mit.edu] On
Behalf
> Of Tamas Rudnai
> Sent: 29 April 2009 19:03
> To: Microcontroller discussion list - Public.
> Subject: Re: [AVR] divide 48bit number by (constant) 125 ?
>
> That's a very nice explanation.  Exactly these kind of things what a
high
> level language compiler supposed to do, however. It is actually really
> if avr-gcc does not support microcontroller specific optimizations and
> storage classes.
>

It does as much as possible, but it can't arbitrarily convert integer
multiplies/divides to shifts and adds unless the multiplier/divider can
be perfectly expressed in a small, finite number of terms.  If not there
will always be some accuracy trade-off against the number of terms used.

Also there is often a danger of impaired accuracy when the operands
magnitudes are small, e.g. in this case an input of 126, 126 or 127 will
give a valid result of 1 when using an integer division.  When using
shifts and additions, the result will be (incorrectly) zero for these
inputs.  This may well be acceptable, but how would a compiler know
this?

IMO this is exactly the kind of optimisation is where human intervention
should be required since the compiler doesn't have enough information to
make a valid decision.

Regards

Mike

=======================================================================
This e-mail is intended for the person it is addressed to only. The
information contained in it may be confidential and/or protected by
law. If you are not the intended recipient of this message, you must
not make any use of this information, or copy or show it to any
received this e-mail, and return the original to us. Any use,
forwarding, printing or copying of this message is strictly prohibited.
No part of this message can be considered a request for goods or
services.
=======================================================================   On Wed, Apr 29, 2009 at 7:06 PM, William "Chops" Westfield
<westfw mac.com> wrote:
>
> On Apr 29, 2009, at 7:56 AM, M. Adam Davis wrote:
>
>> If you decide you only need 1% accuracy, then you only need the first
>> two factors (128, 8192), and if you want 0.001% accuracy you need 6
>> factors (128, 8192, 16384, 262144, 2097152, 8388608).
>
> To get the accuracy you talk about, do you need to have "perfect"
> fractions (1/128 * input) or does truncation suffice? (Rounding?)   I
> think the last time I actually tried to implement one of these "sum of
> power-of-2 fractions" I ended up getting really lousy results, that I
> attributed (rightly or wrongly) to the bits shifted off the right
> side...

That's correct, this is still subject to rounding error.

If you want to include *all* the bits in the calculation, you reverse
the process:
Original right  shifts: 7, 13, 14, 18
Reverse left shifts: 11, 5, 4, 0 then shift right by 18

This takes a  maximum of 2x the number of bits in the original - in
other words, it could take up to 48+48 bits to process a 48 bit
number, if you wanted to get as much accuracy as possible.

output = (input << 11 + input << 5 + input << 4 + input) >> 18;

I don't know if the code generator mentioned actually computes the
accuracy based on all the rounding errors, or just the error from the
"ideal" equation where the intermediate values are computed using
floating point (ie, takes all bits into account).  But it must not,
because the rounding errors can be truly nasty:

Take 12500/125, for instance.  The ideal result is 100.  The binary
division approximate, using floating point rather than integers, is
99.99275. Using all the bits (as in the left shift then right shift
above) you get 99 - which is the integer equivalent of the floating
point version of the binary division.

This is 1% off.

The left shift version, however, results in 98, with 4 factors (128,
8192, 16384, 262144).  The right shift version, by preserving all the
bits, gets at least as close as one digit (and that can be improved by
rounding - look at the 18'th bit before it's shifted off and add 1 to
the result if it's high).  The original right shift version can be off
by more than one integer.  One integer may be a lot or a little,
depending on your range and usage.

If you use this repeatedly, as in infinite impules response filters,
you're going to have bad problems.

Ideally you'd use all the factors down to the 48th bit:
7, 13, 14, 18, 21, 24, 25, 27, 28, 29, 31, 34, 36, 37, 38, 39, 43, 44,
46 (left shifts)

But even then, you're using integer intemediate values, so you'd still
have rounding errors, just not as significant.

However, it's still faster than 'real' division, and yields good
results for most cases.