Password 101 (part 1): hashes and salts

happygeek 1 Tallied Votes 1K Views Share

How do passwords work?

What a password isn't, or at least really shouldn't be, is some kind of secret word or phrase that is simply compared against a table of usernames in a login database. Such plaintext systems are about as secure as a chocolate padlock on a furnace door. Even a login system whereby those passwords are encrypted isn't much better, although many people assume they are safe as houses. Breaches across the years have proven how insecure any system which relies purely on reversible algorithm encryption really is. The user logs in and enters a password, this triggers the system to decrypt the password in the login database that is associated with the username and if they match then authorisation is granted. The weakness here should be obvious, and simply put if someone manages to get access to the encrypted password database itself then it can be copied and cracked offline at their leisure as has happened time and time again.

Hashing

Which leads us to the next step up the password protection ladder, and that's usage of hashing. Think of a hash as being a one-way mathematical function, an algorithm to morph the password itself into a really long number and destroys the password data in so doing. It's the hash value that is stored in the login database, and when a user enters their password the hash is decrypted to provide the 'long number' value of that password and if they match authorisation is made. This makes it much harder to crack if someone gains access to the database itself, or at least it should in theory. Unfortunately, in practise, any two identical passwords (sharing the same character string) will have the same hash value. So it then becomes a matter of the hacker looking for matching hash values within the copied login database and assuming that these are most likely going to be dictionary words, or at the very least passwords that appear in any of the readily available dictionary attack databases of popular password combinations. And there's the weakness you may think, with hacker running a dictionary attack against those identified user logins with matching hashes and, more often than not, getting lucky with a good number of them. But it's actually worse than that for the system admin and much easier for the hacker courtesy of rainbow tables.

Rainbow tables

Rainbow tables are pre-computed lists of the cryptographic hashes relating to any possible password of specific lengths from specific character sets. The smaller the length of password taken from the simplest of character sets, the quicker a rainbow table attack will disclose it. This is why you hear about making passwords 'long and complex' which really means using lots of characters from a mix of character sets such as upper and lower case alphabet, numerical, special characters and so on. Given enough computing power (and if we are talking nation states or a criminal syndicate, although many would say the two amount to much the same, that's no problem) a terabyte rainbow table can crack simple password hashes really very quickly indeed. Unless those hashes have also been salted.

Salted hashes

To overcome the problem of password hashes for the same original password (built from identical characters) sharing the same value and therefore introducing weakness into the system, salting the hash is recommended. A salt is simply a random value that is introduced into the password before encryption takes place so that when it is hashed that random value, or salt, is added and the resulting hash value is unique even for multiple identical originating passwords. Salted rainbow hash tables could be constructed, but to be of any use to an attacker these would have to be improbably large to be of any impact if the original password salting is suitably large to begin with. As such, although you cannot say that a salted hash is 100 percent secure (nothing is) in the real world these will reduce the risk of rainbow tables being effective by a considerable factor.

ryantroop 177 Practically a Master Poster

For a simple PHP example of this in "practice" have a look here:
http://www.daniweb.com/web-development/php/threads/431052/how-to-have-a-password-security#post1847238

It never did get any comments for improvement or correction. It would be interesting to see if this post drums up some new ideas.

Also, I know some languages are pushing for passwords to be hashed using crypt or bCrypt. Do you have any particular opinion on using crypt over manually salt/peppering your passwords?

Hiroshe 499 Posting Whiz in Training

One thing: Using a salted hash alone isn' nearly enough. For the sake of other peoples security, please please please don't roll out your own, unless you know what your doing, and you have people review it.

Use a Key Derivation function that an experienced cryptographer made for you. Examples include PBKDF2, or even better Scrypt (or a variant thereof).

I'm NOT saying that learning cryptography is bad for you. By all means, learn this stuff. But don't implement it youself for something important unless you invest a lot of time studying it, and you have people continuously review your code. Otherwise, I promise you that implementing it yourself is a mistake.

Hiroshe 499 Posting Whiz in Training

I'm going to elaborate more on how rainbow tables work. You've got the just of it though. (I study mathematics and computer science at university by the way, and have commited quite a bit of time into cryptography. That being said, I still find it hard sometimes to read cryptographic papers.)

Let's say you have a hashed password with a salt, and you want to crack it with a rainbow table.

First, you need to determine the password space. That is, the size of the passwords, and which characters could be possibly used.

Next, you start with a list of random passwords (with the salt applied) (say 1000000). The idea is to hash all of the passwords, 10000's of times. This is very fast if you use the GPU or specialized hardware, even on a conventional computer. You only store the password you started out with, and the final hash though. Next, you take the password, and keep hashing it until you find it on the table somewhere. That means that the passwords is probably somewhere on that stream. So, you look at the initial value, keep hasing it until you find the original hash. poof. You've found the password! This works even for password with high entropy if your using a weak Key Derivation function (such as a naked hash).

Hiroshe 499 Posting Whiz in Training

Something else interesting to add to this. This is a table presented in a paper for scrypt. It shows how much money it would cost to break passwords with in a year with various algorithms:
68f3ab39639186a500cb43fe125c8c04

ryantroop 177 Practically a Master Poster

If you look closely at what I suggested in the link, you are not simply hashing the password once with a single salt. In fact, you are hashing 10 times over a composite hash of the salt + pw + pepper, which originates as a salt + pw hash in the first place. All of that hashed with SHA256.

I cant image a rainbow table being of any use against this. Each password will have a global salt applied, followed by a custom pepper, and then hashed against itself.

As I understand it, bCrypt (etc) adds an additional layer of protection with using a guid or something explicit to the system that created it. I could, of course, be wrong on that...

I have enjoyed reading your responses! Thank you for being so open with your knowledge and hard work :)

Hiroshe 499 Posting Whiz in Training

People actively use rainbow table attacks. If you use a PROVABLE key derivation algorithm, it's not effective in practice. However, people do try to make there own ker derivation algorithms, and attackers DO abuse this.

I'm sorry. I don't know who you are. You could be an expert cryptographer that has worked in the TwoFish team in the AES. I don't know. But have you had a cryptographer verify your algorithm? Even algortihms written by professional cryptographers have been broken. Often. And people abuse this. More of then then you might like. People generally don't trust cryptographic algorithms until it's survived VERY EXTENSIVE REVIEW and A LOT of people. Trust me on this. A one man algorithm who hasn't gone though studying mathematical cryptography has little chance of creating a secure algorithm.

10 times is a very small number. Heck, truecrpypt uses 2000 times on default I beleive, and that still considered to be way too small. They can get away with it though using a slighlty different concept. But it still isn't using the full poser of key-streching. Feel free to look it up for yourself.

Even though you obscured it to a point where you can't understand it, doesn't mean it's secure. Cryptography does not use obscurity. Infact, it relies of simplicity. The idea is to make a SIMPLE algorithm, such that it can be mathematically analised (thus, use well exploered mathematical concepts) such that it is more apperent that there is no weakness. Cryptography though obscurity is a thing in the past, because it has proved time and time again to be breakable. This is why most hashes and symmectic cypthers use linear algebra. Because it's very well understood.

I actually looked at your algorithm. Compaired to the proffesional ones, it's very weak. At a glance even I can devise an attack. The people who want to break into your site are probably much smarter then me. It's not much stronger then just using a hash+salt alone. All your doing by adding on the username as a "peper" as you call is functionally extending the password. But if someone got a hold of the hash's, you better beleive that they also have access to a list of usernames. So you not extending the password much at all.

Hiroshe 499 Posting Whiz in Training

Actually, I forgot to mention something about rainbow tables (my apologies). Once you has a random password hash, you have some sort of mapping back to your keyspace. In terms of ryantroop's algorithm, it would just be the keyspace and a known username. You keep applying the hash and mapping over and over again. I hope that makes it more clear.

(it doesn't matter how the mapping works, as long as it's deterministic.)

Hiroshe 499 Posting Whiz in Training

If you wanted to though, you could make your algorithm more secure (again, i'm not making and guarentees here, use a professionally implemented algorithm) by using more memory (in a carefull way) for it, and using maybe about 20000 iterations instead. The idea with using more memory is that it makes parallelization more difficult. Your salt should probably be very large, maybe 256 bits. That at least covers a few more possible approaches. Now, often you don't need to use such a large salt like that, but thats when you're carefull about how you design your KDF. Since this is just a dirty algorithm, and I don't feel like taking a deep look into it, make it big!

For using more memory, what you can do is fill a large array with information from the hash, then use the hash it self to jump to spots in the array. That way, the person who tries to break your algorithm needs to use a lot of hardware to parallelize it.

ryantroop 177 Practically a Master Poster

Im in total agreement that security through obscurity is a fools errand.

I agree, more iterations would not hurt. 10 is a bit weak, but that was demonstrative in nature.

Regardless of that, though, I am curious what your plan of attack would be.

Without having root access to the server, and getting the global salt, how would you deconstruct the hash? Even assuming you got a database dump, and you had the "pepper," you have no way of knowing how many passes were used to create the hash, nor the salt that made it. On top of that, there is little chance of knowing the method of merging the two together.

So, without root access to the server, and seeing the process (which defeats the whole thing anyway), I don't see how you would be able to create a single table that would be able to access ALL the passwords.

I do not mean to be difficult. I am genuinely interested as this is an area I have very limited exposure to, and don't often get called upon to increase my knowledge in the area.

For the record, so you and others are not taking my proposed process as a challenge, I do employ bCrypt for hashing passwords in all but a handful of my projects.

Hiroshe 499 Posting Whiz in Training

No no, please, your not being difficult. This is constructive for everyone reading. Not only that, it's an important subject to be questionaing as much as you can! Creating and challenging cryptosystems is what is done in the real world all the time. I'm not treating it any differently then that.

Without having root access to the server, and getting the global salt, how would you deconstruct the hash?

There are plenty of real world examples of people getting root access to a server.

http://news.yahoo.com/hackers-steal-2-million-facebook-google-twitter-yahoo-041505800.html
http://www.theverge.com/2013/10/28/5038218/alleged-uk-hacker-charged-with-breaking-into-military-and-nasa
http://en.wikipedia.org/wiki/LulzSec

There are hundreds of others that span the history of computing. Most of them never reach the news (obviously target companies would want to keep it quiet).

There are a lot of dirty tricks you can use. Some of them have to do with software vunerabilities, like say heartbleed, or SQL injection. Some of the most effective have to do with spoofing and using social engineering. (do not under estimate this). Some use effective custom worms. This site has some information on other well known attacks.

If you want me to explain some more attacks in detail, go ahead and ask me.

Even assuming you got a database dump, and you had the "pepper,"

Actually, I wouldn't even need to match the "pepper" to the hash in order to break it. I can just assume it's part of the keyspace. If I could match it though, it would be very nice. The pepper really isn't doing much to begin with.

you have no way of knowing how many passes were used to create the hash,

There are plenty of ways to figure out how many passes you used. For example, if you shipped your KDF algorithm, most companies would stick with the default number of hashes (proper KDF's don't loose security, even if the attacker knows the number of passes, so generally there not changed). If I'm able to get the list of hashs and salts in the first place, theres a very good chance I can also get the source code. There are a lot of dirty techniques I can use to get it.

Another more sneaky way I can think of off the top of my head is: I'll create an account on your website. Then I'll see how many iterations it takes to get my password to the hash thats listed on the stolen list. Now I know how many passes, and I didn't even look at the code. This is an example of a chosen-plaintext attack (google it).

Another way would be to time your server. See how long it takes to respond. That'll give me a rough idea if your don't implement your program correctly. This is an example of a timing attack (google it).

nor the salt that made it.

The point of a salt is to stop parallel attacks. So generally you store the salt with the hash, and I'll be getting both. Even if you don't store them togeather, I already got the hashs, so I can probably get the salts using a simular method.

I don't think you understand how powerfull a rainbow table attack is. It's not a trivial matter trying to stop it. (actually, it is trivial. Use Scrypt.) A regular computer can easily break sha256ed passwords in minutes with a rainbow attack. There are plenty demonstrations of that. 10 iterartions is nothing. A hacker with more resouces could rent (or steal) several severs or a botnet, and have 1000x what a regular computer can do. Easy. A more organised organization might be able to produce custom hardware for the task. In the artical, he said you need a table of about a terabyte. That's not true. I beleive the number is much closer to 1 gb, or possbily less depending on how you implement it (off the top of my head). Here's a video that mgiht be able to shed some light: http://www.youtube.com/watch?v=0WPny7wk960 .

On top of that, there is little chance of knowing the method of merging the two together.

You're again assuming that I don't have access to the source code. Are you sure that is an assumption you wish to be relying on? Trusting that your software has no flaws as the only point of security is dangerous. I could easily get a job for your company, and look at the code they used for web development using a bunch of dirty methods. Or I might be able to look at it remotely using a diry method. Or get some inside information.

Modern cryptography assumes that that attcker already knows exactly what algortihm is being used, and the paremeters that are being used. It remains secure, even with that assumption. Yours is assuming that there is no way that any information could be leaked though some other means about your algorithm - I hope you can see why this is dangerous. This IS an example of security through obscurity, which you already said is a fools errand.

So, without root access to the server, and seeing the process (which defeats the whole thing anyway), I don't see how you would be able to create a single table that would be able to access ALL the passwords.

Look up SQL injection. It's a simple, yet famous example. Doesn't need root access. Beleive me, there are a lot more ways then just that.

Even if I has root access, passwords are still valuable, as I'll demonstrate.

So, using one of many standard techniques (say an SQL injection), I got a list of hashs and salts. Getting a list of usernames is usually easy. Then I've figured out the number of passes easily by testing my own password. (chosen-plaintext). Next, it's just a standard rainbow table attack. The only thing special about your algorithm is that the usernames are part of the keyspace (assuming I can't come up with another technique to match them. I probably can.). This is easily computer on a laptop GPU in a few minutes. Especially if your only using 10 passes, with very little memory usage. Now I have usernames and passwords. This is a realistic situation if peopled used their own homebrew KDFs.

Let's have some fun. Most people use the same password on a lot of there accounts/websites. See the problem? Yup, automatically test a bunch of email passwords, and create a list of the one's that worked. That's plenty of information to steal peoples identities. So, I sell them on the black market, and oh look. I made an easy $5000 bucks (more or less, depending on how big the catch is). All because you decided it would be fun to use your own homebrew KDF.

Hiroshe 499 Posting Whiz in Training

I did some more digging into the size required for a rainbow table. It's highly dependant on the method your using and your hardware. A reasonable number might be 67108864, and thats times 512 bits for a chain for a total of 4Gb. That's nothing at all. It can easily fit into ram on a laptop. (numbers from https://www.freerainbowtables.com/en/faq/)

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.