Hello everyone,

I am posting this more generically in the software development section to see what might be available and to increase a possible success rate. But enough of that let's get to the question at hand.

Recently, in C#, I started back up a project of mine to do RSA encryption. Part of what makes an RSA cipher successful is the production of large, random numbers, that are prime. For my project, I am using a library that uses random.org to seed a random number generator, and then producing VERY large values that must be stored in BigIntegers (for those who are unfamiliar with these, they pretty much, at least from what I can tell, can hold a number of what ever size you want ... I think they emulate the value from like a byte array).

Anyway, as you saw I said these are "Prime" random numbers. While I have the formula that can create the keys fine, the problem that kills it is it takes me forever to determine if these large random numbers are prime.

So I ask this question, what can I do to speed up the code to produce large random numbers that I know are prime, or what examples do you have? I have done the usual for loop and performing a modulo to see if the remainder is 0, but with numbers this large it's really slow (and by that I think I once forgot it was running and had it going and like 30-60 mins later, the first random number still wasn't found). I am not looking for a C# specific answer, more like pseudo code you might say, or processes

(On a side note, yes I know C# does have a built in RSA library but I hate it. The p and q values produced are too small, and I have no clue how its random engine works, or what it uses. I also hate how it expects the keys, with all the what should be concealed values being flaunted in the object).

I did just that in the deep, dark past when I was studying RSA and public key encryption in general. That was back in the mid-late 1980's. I wrote a program that first seeded an array of 10,000 numbers using the Sieve of Aristostenes. Then, used recursion to determine if any number is prime. Of course, you first eliminate those that have an even digit at the end! It is only the odd ones (so to speak) that are of interest. Unfortunately, the code for that is on a floppy disc in a box somewhere in my storage room... I'd have to reconstruct it, but you might be able to find something on the net that does just that. On an old 8086 system, I could determine if a number up to 16 digits was prime in microseconds. I used that routine to factorize large numbers into primes and to generate Goedel numbers from strings.

All that aside, the really big 1024-2048 digit numbers used for public key encryption these days requires the use of an unlimited precision math library. The boost library has such. Also, you need two of these mongo numbers, one for the public key and the other for the private one.

As a final aside, I was about to build my own unlimited precision math library at that time, but I got a job as director of R&D for a software firm in Syracuse, NY where my wife was doing a post-doc in physics. That kind of put the kibosh on that project! :-)

Hmm, that even number at the end, I can't believe I didn't think of that. I could do that to speed it up.

By the way, what exactly do you mean an "unlimited precision math library"

(Oh and yeah I know I need two large numbers, I can do the RSA encryption by hand, just not with large ps and qs)

Most Intel-based chip sets support 64 bit math (80 bits internally to reducing rounding errors) which gives you up to 16 quadrillion in total values (16,000,000,000,000,000,000) - something less than 20 digits. For larger numbers, you need a library that supports an infinite number of digits - IE, if you need 2000 digits, that will handle it. Doing math on such large number is much slower than the internal math capabilities of the CPU, but still workable. There are dedicated chips (FPGAs mostly) that are designed for computing these numbers in hardware, but you won't find them on most systems as they are dedicated to only one thing - computing public/private keys and supporting the encryption/decryption that RSA and such require.

Since decryption of large messages with RSA and such is very slow, most systems only use the public key encryption to encrypt a smaller symmetric key and then use that to encrypt/decrypt the actual message. IE, the 1024-2048 digit public key is used to encrypt a much smaller (faster) key that is passed to the user who will use their private key to decrypt, and then use that key to decrypt the message.

A great book to read on this subject is Bruce Schneier's "Applied Cryptography" - considered to be a "bible" of the field.

Oh I see what you mean, yeah that would make sense for dedicated chips. Also isn't RSA kind of limited on the size of data it can handle? The one thing I have learned is usually you use RSA and AES in tandem with each other, using RSA to encrypt/decrypt the AES Key and IV.

Also I have a rather good book for cryptology, I just can't remember the name of it at the monent. It has a lot of notes of course written in it making it more valuable to me (one of the last classes I took for my degree was a graduate level Applied Cryptology class ... do miss that class, really wish we went over Quantum Cryptography though)

I know there's a Paralel.For in C# and probably some same kind of construct in other languages. Perhaps worth to give it a look..

How large is your prime number? The fastest way (if not the fastest) to determine if N is a prime number is to have a pre-calculated of all prime number below sqrt(N). Then, it takes only O(sqrt(N)/ln(sqrt(N))) to determine the prime by mod with all prime below sqrt(N).

For example, if you have 10 digits number, it will take at most 10k mod operation which is pretty fast.

commented: Pretty much what I said, just clearer! :-) +13

@ddanbe, that Parallel.For loop looks like a rather interesting thing to investigate. I have had thoughts about using threads, but couldn't really come up with a good way to split up the number. This, however, might offer a solution to the idea. I'll have to research into these more

@invisal, that will probably be out of the option. The values I am producing can, and do exceed the limits of even a double in C#. I can't recall off the top off my head, but I want to say in some tests I was producing ones over 30 digits in size. I wanted to produce crazy high random numbers to help make the encryption algorithm operate more effectively.

What I have been wondering about, is there some way I can do something like a binary search tree concept. Like take the square of a value, or split it somehow, and have lower branches repeat, creating a recurssive function that tests multiple elements at the same time, but in smaller values. I am pretty sure there's not some simple math behind it, but then again, encryption isn't what it is because of simple math.

Do NOT try to implement encryption algorithms yourself. You'll fail and your encryption will be dead easy to break by every script kiddie out there.

You simply don't know enough to replicate or improve on the work of the generations of specialists who've done the actual work. If you did you'd not need to ask the questions you're asking, it's that simple.

That's not to say it's not a nice theoretical exercise to want to create an implementation just to see how things work, but if that were your goal you'd not care about performance :)

@jwenting, I beg to differ.

There is a saying about ciphers, "Make the algorithms public, but the keys private". This has been the case for every cipher in history (with the exception of the Caesar Cipher, which is one of the oldest if not oldest cipher). That being said, and RSA encryption is rather easy to implement.

The reason I am not reusing the .NET version of the RSA cipher is because it's not a very good implementation. Specifically the generation of the keys I believe sucks. RSA depends on the p and q value, values that must be very large, truly random numbers, and the .NET version only produces these values about 15 numbers long (that's not large enough). This is why I am doing it myself.

The reason why is because the P and Q values in an RSA encryption are very important. If even one of these values is figured out by a user, the whole entire encryption algorithm will be compromised, and new keys will need to be generated.

There is a research paper out there called "Mining your Ps and Qs" in which the author finds horrible flaws in company’s entropy sources for their devices. Entropy, as you may know, is "randomness", which can be used for instance to seed random number generators. If this source isn't truly random, and risks even the slightest pattern repeating, it can compromise the system. I would strong suggest finding and reading this paper because it points out a lot of devices out there fail here, where they only using items like clocks to seed the randomness (which they were able to then use a second device, set the same time, and replicate the random numbers generated). These can include items like Routers, Smart TVs, and other devices. In fact, Play Station was hit hard with this once (I think it was the 3rd) where there entropy source wasn't good enough, and because of that people hacked the system, and could do thinks like running pirated games and what not (it was so bad the only way it could be fixe was a hardware update, not software).

This is why I am rebuilding the application. I do not trust the randomness of the built in .NET library, and do not like the size of the keys. I want to make a more secure library then what they provide. This is why I use the library RandomOps, which the user has built in a feature to allow for me to seed the CMWC random number generate with values from Random.org (who obtains their random values, or entropy source, from a collection of radio frequency receivers).

As I have mentioned earlier in my post, I can do the math by hand with smaller random numbers (had to for a class of mine), so I know the formulas very well that are needed, and they aren't super complicated.

The reason I posted this question is because I was looking for a way to make this library practical. This is why I asked the question, because I have a feeling there might be a more effective way to determine if a number is prime then the way I am doing it (programmers are always learning and improving their skills, this is no exception)

You can use Fermat Test. It has high probability of getting the right answer. We know that if a^(n-1) mod n != 1, then n must be a composite number.

def ModPowerRecursive(a, p, m):
    if p == 0:
        return 1
    elif p % 2 == 0:
        return (ModPowerRecursive(a, p >> 1, m) ** 2) % m
        return (a * ModPowerRecursive(a, p - 1, m)) % m

def FermatPrimeTest(n):
    if n == 1:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False

return ModPowerRecursive(2, n - 1, n) == 1

Testing 100+ digits prime

 # return True (in less than second)

Noted again that, this algorithm does not guarantee to give the right answer, but the chance of getting the wrong answer is very slim. For number below 100,000, there is 0.078% chance of getting wrong answer. For number below 1,000,000, there is 0.024% chance of getting wrong answer. As number grow, the chance is getting slimmer and slimmer.

So after recovering from being sick, I have finally been able to try some of these approaches.

First of all, Parallel.For doesn't support a BigInteger variable type. However, I did find one work around to this, but the speed was still rather brutally slow

I then tried the Fermat, and this does seem to be MUCH faster. Istill will need to test it with larger numbers. However, seeing that there is a chance it can fail, I will need to continue to research ways around this. (or a fall back once it produces what it believes to be a valid prime) .

Unfortnatly there is some flaw in some logic in the key generation of my code (not sure what, need more tests)

However, seeing that there is a chance it can fail, I will need to continue to research ways around this.

To be clear, there is no fast way to determine if a very large random number is a prime. Let me quote from reputable source.

  • The wrong way to find primes is to generate random numbers and then try to factor them. The right way is to generate random numbers and test if they are prime. There are several probabilistic primality tests; tests that determine whether a number is prime with a given degree of confidence. Assuming this “degree of confidence” is large enough, these sorts of tests are good enough. I’ve heard primes generated in this manner called “industrial-grade primes”: These are numbers that are probably prime with a controllably small chance of error.

  • Assume a test is set to fail once in 250 tries. This means that there is a 1 in 1015 chance that the test falsely indicates that a composite number is prime. (The test will never falsely indicate that a prime number is composite.) If for some reason you need more confidence that the number is prime, you can set the failure level even lower. On the other hand, if you consider that the odds of the number being composite are 300 million times less than the odds of winning top prize in a state lottery, you might not worry about it so much.

Applied Cryptography, Second Edition: Protocols, Algorthms, and Source Code in C by Bruce Schneier

The only algorithm I used in commercial software was, before AES and all that good stuff, triple DES. Still pretty secure. So, unless you are a serious math wiz and security expert, don't try to "roll your own" encryption software, as others have so clearly stated here. Do it as a learning tool only. Just because you cannot break your algorithms doesn't mean others cannot.