Hello,

I wrote a library for arbitrary precision arithmetic recently and spent a long time implementing efficient multiplication algorithms. Now I am looking to see if it is possible for me to optimize anywhere else. One place that I thought should be more optimizeable is my division algorithm. Right now I use (hiding some internal implementation stuff):

Int divide(Int numerator, Int denominator)const
{
    if (denominator==0)
        return DIVIDE_BY_ZERO_ERROR;
    if (numerator==0)
        return 0;
    if (numerator.sign()!=denominator.sign())//result is negative
        return -((-numerator)/denominator);
    if (numerator==denominator)
        return 1;
    Int num=numerator.abs();
    Int den=denominator.abs();
    Int ret=0;
    while (num>den)
    {
        ret++;
        num-=den;
    }
    return ret;
}

Which is fairly slow, especially since subtraction takes awhile when done on Int's instead of on the individual digits of those Int's. Is there a faster way to get the integer part of the quotient of two numbers? Perhaps one that actually uses their internal digit-array representation?

These are arbitrary-precision integer datatypes, sorry if I didn't make that clear. IE: Not int but Int which is a class that stores integers in an array of smaller integers.

Naming conventions... :-) I would suggest that you use something other than Int for your AP class to avoid this sort of confusion, such as APInt, or AnInt, etc. Depending upon the font, it can be difficult to tell int apart from Int, especially for us old farts! :-)

Yeah, sorry about that. I actually have the class wrapped in a namespace so that ambiguity is avoided. Does that count as a valid naming convention? (IE: LAB::Int)

One minor thing, I would probably move this test:

if (numerator.sign()!=denominator.sign())//result is negative
    return -((-numerator)/denominator);

to the start of the function to avoid a double testing for the 0 conditions. Currently, if you have a negative, non-zero fraction, you first test both num and den for being zero, then, you see that the fraction is negative, make a recursive call in which you will, again, test both num and den for being zero. Simply moving the negativity test to the beginning will solve that minor inefficiency.

But, of course, the major thing is the following loop:

Int ret=0;
while (num>den)
{
    ret++;
    num-=den;
}

This is linear in "ret", meaning that you just repeatedly subtract den from num until you can't do it anymore. This seems terribly inefficient. I would recommend using a method that does it in O(log(N)) time instead of O(N). Here is a simple method to do it:

Int num = numerator.abs();
Int den = denominator.abs();
Int ret = 0;
Int mask = 1;
while( num > den ) {
    den <<= 1;
    mask <<= 1;
};
while( !(mask & 1) )
{
    den >>= 1;
    mask >>= 1;
    if( num > den ) {
        ret |= mask;
        num -= den;
    };
};

At least, this does the work in log(N) where N is the value of "ret" and log is base 2. If you didn't understand this, it is quite simple, you multiply "den" by 2 as many times as possible while remaining under the value of "num". And then, you go back down to the original value of "den" by repeatedly dividing by two (right-shift) and recording a corresponding bit in "ret" for each valid subtraction. This means that the two loops only run for as many powers of 2 (or highest bit) there are in "ret", which is exponentially better than O(N).

I believe that this is pretty much the way it is implemented in actual CPU division modules (except that with fixed-precision, you can unroll everything, instead of looping).

N.B.: If you say "but I haven't got around to implementing bit-wise operators (and, or, shifts, etc.) for my Int class yet...", then I say, "Do it!". Operators like add / subtract / mult / div are all higher-order operations, and you first need to do bit-wise operators, and build up from there.

Comments
ONce again, showing deep knowledge

Excellent response as usual! Of course I have implemented the logical functions, as well as optimized them as far as I possibly could. I was sort of hoping that the response would use them, at least to a certain extent.

That method also has the benefit of often only doing very simple bitwise operations (bitshift by 1 bit [or 8 bits unrolled], check last bit, etc) which can be done extremely quickly on my raw data.

Thank you.

Also, another little note on the first minor issue with the ordering of the conditions. In general, the rule when you have to do a number of checks like this at the start of a function is that you should always start with the checks that are most likely to fail (i.e., the more common occurences), and end with the least likely to fail. This way, you avoid unecessary checks in most cases. Your example is a clear case of that, the likelihood that a fraction is negative is about 50%, which is much more than the likelihood of a divide-by-zero or a zero numerator.

Also note that the check for the numerator being 0 is not strictly necessary, and might be detrimental to the performance, since, in general, a condition check (branch) is more expansive than a trivial degeneration of the algorithm that produces the same answer. However, in this case, I think it is OK to have it because, without it, you end up doing two conditional checks (the while loop conditions) before getting at the value of 0.

Also, I just noticed that the while loop conditions should have a greater-than-or-equal tests:

Int num = numerator.abs();
Int den = denominator.abs();
Int ret = 0;
Int mask = 1;
while( num >= den ) {
    den <<= 1;
    mask <<= 1;
};
while( !(mask & 1) )
{
    den >>= 1;
    mask >>= 1;
    if( num >= den ) {
        ret |= mask;
        num -= den;
    };
};

this is to account, correctly, for an exact division (no remainder).

This question has already been answered. Start a new discussion instead.