I am trying to implement my own version of BigInteger which can hold an arbitrary big integer. I have posted my BigInteger.h file here and I need your suggestions if there is any better design for this.

1. std::string private member will hold the BigInteger in the form of string.

2. Addition :
a. take each character from right
b. convert it to int.
d. Add both. hold the sum in one variable and carry over in another.
e. convert sum to string and append to the result string.
f. carry over variable to be used in the next set of addition.
g. repeat till the left most end of the string.


3. This is very much a preliminary version. In the future, I also plan to overload "=" operator so that I can assign values directly without using constructors.

I'm a bit skeptical about the 1st point. Is there any better way to implement this other than using strings? Also 2.b and 2.e. Is there any other solution other than converting a string to int, adding and then converting back to string?

#include <iostream>
#include <string>
#include <sstream>

class BigInteger
{
    private:
        ////////////////////////////////////////////////////////////////////////
        // The big integer will be stored as a string
        // in this private member.
        ////////////////////////////////////////////////////////////////////////
        std::string _number;

    public:
        ////////////////////////////////////////////////////////////////////////
        // Default constructor to initialize _number.
        // Sets the value of _number to "0".
        ////////////////////////////////////////////////////////////////////////
        //  Parameters  :    none
        ////////////////////////////////////////////////////////////////////////
        BigInteger():_number("0"){}

        ////////////////////////////////////////////////////////////////////////
        // Constructor to initialize _number.
        // The integer which represents the big integer
        // to be passed as argument. Integer will be converted
        // to string and stored in _number.
        ////////////////////////////////////////////////////////////////////////
        //  Parameters  :    integer
        ////////////////////////////////////////////////////////////////////////
        BigInteger(int integer)
        {
            std::stringstream stream;
            stream << integer;
            _number = stream.str();
        }

        ////////////////////////////////////////////////////////////////////////
        // Constructor to initialize _number.
        // The string which represents the integer
        // to be passed as argument.
        ////////////////////////////////////////////////////////////////////////
        //  Parameters  :    std::string which represents the integer
        ////////////////////////////////////////////////////////////////////////
        BigInteger(std::string str):_number(str){}

        ////////////////////////////////////////////////////////////////////////
        // Prints the number to console
        ////////////////////////////////////////////////////////////////////////
        void print(){std::cout<<_number<<std::endl;}

        ////////////////////////////////////////////////////////////////////////
        // Overloads "+" operator for addition
        ////////////////////////////////////////////////////////////////////////
        //  Returns     :    BigInteger object after addition
        //  Parameters  :    BigInteger object r_value (explicit)
        //                   BigInteger object l_value (implicit)
        ////////////////////////////////////////////////////////////////////////
        BigInteger operator + (const BigInteger &r_val);

        ////////////////////////////////////////////////////////////////////////
        // Overloads "-" operator for subtraction
        ////////////////////////////////////////////////////////////////////////
        //  Returns     :    BigInteger object after subtraction
        //  Parameters  :    BigInteger object r_value (explicit)
        //                   BigInteger object l_value (implicit)
        ////////////////////////////////////////////////////////////////////////
        BigInteger operator - (const BigInteger &r_val);

        ////////////////////////////////////////////////////////////////////////
        // Overloads "*" operator for multiplication
        ////////////////////////////////////////////////////////////////////////
        //  Returns     :    BigInteger object after multiplication
        //  Parameters  :    BigInteger object r_value (explicit)
        //                   BigInteger object l_value (implicit)
        ////////////////////////////////////////////////////////////////////////
        BigInteger operator * (const BigInteger &r_val);

        ////////////////////////////////////////////////////////////////////////
        // Overloads "/" operator for division
        ////////////////////////////////////////////////////////////////////////
        //  Returns     :    BigInteger object after division
        //  Parameters  :    BigInteger object r_value (explicit)
        //                   BigInteger object l_value (implicit)
        ////////////////////////////////////////////////////////////////////////
        BigInteger operator / (const BigInteger &r_val);

        ////////////////////////////////////////////////////////////////////////
        // Overloads "%" operator for modulo
        ////////////////////////////////////////////////////////////////////////
        //  Returns     :    BigInteger object after modulo operation
        //  Parameters  :    BigInteger object r_value (explicit)
        //                   BigInteger object l_value (implicit)
        ////////////////////////////////////////////////////////////////////////
        BigInteger operator % (const BigInteger &r_val);
};

To my mind, I can't see any reason to use a string as storage. As you mentioned, in order to use it at all, you have to constantly convert it back and forth. A vector of unsigned ints and a bool for the sign seems better. Just think of it as doing math in base 4294967296. Each unsigned int is a "digit" ranging from 0 to 4294967295, then do math the normal way. Actually to make things more obvious, use uint32_t instead of unsigned int. For the math, convert from a uint32_t to a uint64_t, then do the math, then convert back when you are done. Making things unsigned saves you the headache of adding 2 billion to 2 billion, then having it overflow into a negative number and checking. Do everything in hex. READ the numbers in as a string in hex. It allows to easily split a very long number into the appropriate 32-bit "digits" simply by looking at the the string length. You STORE everything as a vector of unsigned integers and you do all the arithmetic using uint32_t and uint64_t (verify of course that you HAVE these types. I'm not 100% positive that you are guaranteed to have them. Perhaps you are, but verify. If for some reason you don't, use uint16_t and uint32_t instead. You'll have those for sure in stdint.h).


FFDD345AAFDCC218930240366331 and EEEE333112AADCFEB72218

Split into unsigned ints and store as vectors...
a = {0x40366331, 0xC2189302, 0x345AAFDC, 0x0000FFDD}
b = {0xFEB72218, 0x3112AADC, 0x00EEEE33}

You now have a little-endian representation where the vector index is the exponent and the base is 4294967296. You now add, subtract, multiply as normal. You typecast to uint64_t for individual arithmetic operations so as to allow the computer to handle all that carrying and overflow for you.

uint_64 digit = a[0] + b[0];
sum[0] = (uint32_t) (digit & 0xFFFFFFFF); // least significant 32 bits
uint32_t carry_digit = (uint32_t) (digit >> 8);

Rinse and repeat, just like you did in elementary school except you're using a bigger base. Works for multiplication too!

Edited 4 Years Ago by VernonDozier: n/a

A vector of unsigned ints and a bool for the sign seems better.

Yes, I agree!!

Do everything in hex.

I agree on this too as this makes splitting easier.

READ the numbers in as a string in hex.

Actually I intend to read numbers as a string in decimal (not hex). I will convert it to hex string (using stringstream) for splitting purpose. But I find an issue here. If the number in the string is greater than the integer limits, conversion fails. I am finding a way to solve this. If you have any ideas, please come forward.

I'm having a little trouble visualizing the user input aspect and how it might releate to stringstreams. What does the user input for 5 trillion?

  1. 5000000000000 (no delimiter)
  2. 5 000 000 000 000 (space delimiter)
  3. 5,000,000,000,000 (comma delimiter)
  4. Something else

Choice #1,
no delimiters. Because you can't expect the user to be too smart to put delimiters or to enter hex. As one of my colleagues say, we have to write a code keeping in mind that your end user is the dumbest in the world :P

So I guess the task is to turn 5000000000000 into {0x27395000, 0x0000048C}. Not nearly as easy as hex input, but certainly doable. Why not turn the string into this...

{0, 50000} where the base is 1 billion rather than 4294967296 and the rest of it as described above, just with (temporarily) a different base. I still think you should then convert to base 4294967296. To that end, you'll need to do some base-changing math operations. Therefore I suggest that you write the mathematical FIRST, then handle the user input code afterwards, which is backwards from the general way of doing things. The reason is that since you need to write the carry and borrow and add and subtract and multiply and divide (and possibly mod) functions anyway. Once you have all that, implementing base changing functionality will be much easier.

As far as the string to vector of digits functionality, I picked base 1 billion because...

  1. The digits will be 9 characters long, so breaking the string is easy, just like in hex.
  2. It fits into a uint32_t.
  3. It's as close as you can get to base 4294967296 when meeting criteria 1 and 2 above.

So I would design the class members as containing these data members.

  1. vector <uint32_t> digits;
  2. bool sign;
  3. uint64_t base;

Add some functionality to convert any base into any other base. You may not have intended to have this functionality, but once you have it, your program can implement these "helper functions" so that they no longer only "help". You can use them as part of your base functionality.

Anyway, that would be my approach.

Comments
thanks, this looks great!!!

Here's my approach:
Read the value into a string (StrNum for example).
Create an byte array StrNum.length in length (ByteNum).
Convert each character in StrNum to a byte in BytNum -- ByteNum[x] = StrNum[x] - '0'; Now you can use ByteNum to do all the math.

This way you don't have to deal with any size issues, non-standard type definitions, et al.

So I guess the task is to turn 5000000000000 into {0x27395000, 0x0000048C}.

Yes, exactly and thank you once again for another great suggestion. This much info is enough for me to proceed. I will get back after I finish. Don't know when, b'cos this is something like a hobby when I do only during free time :)

BytNum -- ByteNum[x] = StrNum[x] - '0';

I didn't get what you are trying to do here. Could you please explain.

And I agree with the fact that this solution will do addition and subtraction with ease. But what about multiplication? Am I supposed to take each byte by byte and multiply just like manual multiplication?

BytNum -- ByteNum[x] = StrNum[x] - '0';

I didn't get what you are trying to do here. Could you please explain.

Converts the character value to a binary value. Look at an ASCII chart and study all the values in the code -- in binary.

And I agree with the fact that this solution will do addition and subtraction with ease. But what about multiplication? Am I supposed to take each byte by byte and multiply just like manual multiplication?

Yes.

I have a couple of things to say regarding the base choice. In general, you'll want your base to be as big as possible, because this implies faster operations. However, there are other factors you should also consider, such as potential overflow during operations and conversion time from and to other bases.

If you want your bignum class to be as fast as possible, you can choose your base to be sqrt(4294967296) = 65536 (not 4294967296, because then there is the possibility of overflow during multiplication/addition). However, if you do it like this, input and output in dec are not trivial matters.

If you want to save yourself the base conversion stuff and still have some decent speed, you can choose your base to be 10000 (not 1000000000, because of the multiplication/addition overflow potential). 10000 and 65536 are of almost the same order, so there's no considerable performance loss. Plus, input and output in dec are piece of cake. I'd go with that.

Walt's approach (base 10) is probably the simplest/easiest way to go, but you wouldn't want to use it for serious bignum crunching, as it's quite slow (but then again, if you want serious bignum crunching, you'd be better off using GMP or something like that).

this is something like a hobby when I do only during free time

I trust that you're telling the truth, so here's something to get you started -> http://ideone.com/2aBA1 Note that there's no support for negative numbers (a - b assumes a >= b), division or modulo operation. Also, if you find any bugs, let me know.

Edited 4 Years Ago by m4ster_r0shi: n/a

Sorry. Forget the 65536 vs 4294967296 and 10000 vs 1000000000 stuff. I forgot that 64 bit integers were standardized...

I think we're actually speaking the same language. The base storage needs to be the square root of whatever units you are using. I thought everyone had uint64_t (a little research suggests that's not the case and I've just been lucky), so I figured do the arithmetic using that. The base could be 4294967296 since the largest number in that would be 4294967295 and 4294967295 times 4294967295 would fit in a uint64_t, so no overflow on the arithmetic. If you don't want to use uint64_t, you can use uint32_t for the arithmetic and use 65536 as the base.

So I think we're speaking the same language here regarding overflow. Perhaps not.

>> Sorry. Forget the 65536 vs 4294967296 and 10000 vs 1000000000 stuff. I forgot that 64 bit integers were standardized...

I've been googling this. Mostly getting forum hits and they're mostly saying it's NOT standard in the old standard, but is in the 0x standard. I'm looking for the official word on that.

This article has been dead for over six months. Start a new discussion instead.