Thursday, February 2, 2012

Passwords

FLG received an email a week or so ago that his data at Zappos.com was compromised:
First, the bad news:

We are writing to let you know that there may have been illegal and unauthorized access to some of your customer account information on Zappos.com, including one or more of the following: your name, e-mail address, billing and shipping addresses, phone number, the last four digits of your credit card number (the standard information you find on receipts), and/or your cryptographically scrambled password (but not your actual password).

THE BETTER NEWS:

The database that stores your critical credit card and other payment data was NOT affected or accessed.

Once FLG heard the credit card data wasn't accessed he breathed a sigh of relief. However, FLG thinks most people might not want to do the same just yet. Zappos recommended, and in fact may even have forced, customers to change their passwords in response.

Wait, didn't they just say that the hackers only got "your cryptographically scrambled password (but not your actual password)?" Yes, but what exactly does that mean?

Most websites, if they have any clue at all, store passwords in their databases using one-way hashes. There are a few of these, the most common probably being MD5, SHA-1, SHA256, and SHA512. The first two aren't considered secure anymore, so hopefully Zappos was using SHA256 or SHA512. Think of a one-way hash as a machine that takes any amount of data you want to throw at it, from a simple password to whole files, and out the other side comes a fixed-length output that is, for all practical purposes, unique to whatever you threw into the machine.

There are two features that make these one-way hashes useful. First, the output is kind of like a fingerprint. A small change in the input results in a large change in the output. Oftentimes on the internet, even today but especially in the old days of spotty internet connections, people would post the hash of the file that they are offering for download so that you could run a hash on the copy you'd downloaded to verify it was an exact, untainted replica. Second, there's the one-way part, which means it should be trivial to take some input (password, file, etc.) and generate a hash, but practically impossible to go the other way (take the hash and generate the password or file). This will probably make more sense with some examples.


FLG typed "password" into this hash generator and got the following MD5 hash:

Original text: password
MD5: 5f4dcc3b5aa765d61d8327deb882cf99

Again, MD5 is not secure for password hashing use, but still makes the point and generates short values that are easy for FLG to post.
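The same result can be reproduced without a web page. Here is a minimal sketch using Python's standard hashlib module; the helper name md5_hex is made up for illustration. It also shows the two properties described earlier: the output is always the same length, and changing one letter of the input changes the whole hash.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as 32 hex characters (illustration only)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

short = md5_hex("password")
changed = md5_hex("passwore")              # one letter different
long_input = md5_hex("password" * 10_000)  # 80,000 characters of input

print(short)       # 5f4dcc3b5aa765d61d8327deb882cf99
print(changed)     # looks nothing like the line above
print(long_input)
print(len(short), len(changed), len(long_input))  # 32 32 32
```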

Okay, so what Zappos is saying is that the hackers didn't get the plaintext word "password", but instead got something akin to 5f4dcc3b5aa765d61d8327deb882cf99. Once a hacker has a password hash, they would typically run a dictionary through whatever the hash algorithm is, in this case MD5, generating an output for each word and comparing it to the hash they have. So, they'd take the word "apple" and generate 1f3870be274f6c49b3e31a0c6728957f. That doesn't match 5f4dcc3b5aa765d61d8327deb882cf99, so they keep going.
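A rough sketch of that loop in Python, assuming the stolen hash is plain unsalted MD5. The five-word list and the function name dictionary_attack are stand-ins for illustration; a real attacker would read millions of words from a file.

```python
import hashlib

# The stolen hash from the example above (MD5 of a password the attacker doesn't know).
stolen_hash = "5f4dcc3b5aa765d61d8327deb882cf99"

# Tiny stand-in for a real dictionary file.
wordlist = ["apple", "letmein", "monkey", "password", "zappos"]

def dictionary_attack(target_hash, words):
    """Hash each candidate word; return the first one that matches, else None."""
    for word in words:
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target_hash:
            return word
    return None

cracked = dictionary_attack(stolen_hash, wordlist)
print(cracked)  # password
```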

It might sound like this would take a long time, but it doesn't. Computers can run through millions or tens of millions of these things in no time. And then there's the problem of precomputation. There's no reason a hacker has to wait until after they have compromised a website to generate hashes. They could simply generate a password hash for every word in the dictionary BEFORE the hack, save it to a file, and then once they have the hashes they can search through the file to see if they have one that matches.
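Precomputation can be sketched the same way. This is an illustration only, again assuming unsalted MD5; the word list, the second "stolen" value, and the name build_lookup_table are invented for the example.

```python
import hashlib

def build_lookup_table(words):
    """Done once, before any break-in: map each word's MD5 hash back to the word."""
    return {hashlib.md5(w.encode("utf-8")).hexdigest(): w for w in words}

wordlist = ["apple", "letmein", "monkey", "password", "zappos"]
table = build_lookup_table(wordlist)

# After the break-in, each stolen hash costs a single lookup instead of a full pass.
stolen_hashes = [
    "5f4dcc3b5aa765d61d8327deb882cf99",  # hash of a dictionary word
    "ffffffffffffffffffffffffffffffff",  # made-up value standing in for a hash not in the table
]
recovered = [table.get(h) for h in stolen_hashes]
print(recovered)  # ['password', None]
```

The expensive hashing happens once, ahead of time, and the same table can then be reused against every site that stores unsalted hashes.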

To combat this precomputation threat, people salt passwords. A salt is a random value that is added to the password. FLG added a user named dummy to the Linux computer he has in his house to give you an example. Dummy's password is password, and here's his password information:
dummy:$6$kAkbkLsi$bsepfjdr87GZNikGZcc/OveT/akVbzZGaaXsjg5qSa2vV4NpKym6Rg6UOLNdXy3thUqy7PZ7PNi81q9J1DVJ30

Breaking this down -- dummy is obviously the user name, and a colon separates it from the rest. After that, the character $ is a separator. 6 indicates that SHA512 is the hash algorithm. kAkbkLsi is the salt. bsepfjdr87GZNikGZcc/OveT/akVbzZGaaXsjg5qSa2vV4NpKym6Rg6UOLNdXy3thUqy7PZ7PNi81q9J1DVJ30 is the hash.
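The same breakdown can be done mechanically. A small sketch in Python, assuming the entry has exactly the shape shown above (user name, a colon, then $id$salt$hash); the function name parse_entry and the one-entry scheme table are made up for illustration, and a real password file line carries extra fields that this ignores.

```python
entry = ("dummy:$6$kAkbkLsi$bsepfjdr87GZNikGZcc/OveT/akVbzZGaaXsjg5qSa2vV4NpKym6"
         "Rg6UOLNdXy3thUqy7PZ7PNi81q9J1DVJ30")

# Only the scheme id discussed in this post; others exist.
SCHEME_NAMES = {"6": "SHA512"}

def parse_entry(line):
    """Split 'user:$id$salt$hash' into its four parts."""
    username, _, crypt_field = line.partition(":")
    # The field starts with "$", so splitting on "$" yields an empty first item.
    _, scheme_id, salt, digest = crypt_field.split("$")
    return {
        "username": username,
        "algorithm": SCHEME_NAMES.get(scheme_id, "unknown"),
        "salt": salt,
        "hash": digest,
    }

parsed = parse_entry(entry)
print(parsed["username"], parsed["algorithm"], parsed["salt"])  # dummy SHA512 kAkbkLsi
```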

The salt works like this. The computer takes the password, in this case the word "password", and then either prepends or appends the salt to it before running it through the hash algorithm. So, let's say prepend. This means that rather than password being hashed by itself, instead kAkbkLsipassword is hashed and the output is that crazy long string of random junk. (The real algorithm mixes the salt and password in a more involved way, but the idea is the same.) This gets around the precomputation issue because now the hashes of the dictionary the hacker has are useless. The hacker has to go back and rehash every word in the dictionary with kAkbkLsi tacked onto the front. This still won't take that long, but at least they can't do it beforehand.
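Here is a toy version of that idea in Python. It is a deliberate simplification: it hashes salt + password once with SHA-512, which is not what Linux actually does, so its output will not match the hash in dummy's entry. The function name toy_salted_hash and the word list are invented for the example.

```python
import hashlib

def toy_salted_hash(salt, password):
    """Simplified salting: SHA-512 of salt + password. NOT the real Linux scheme."""
    return hashlib.sha512((salt + password).encode("utf-8")).hexdigest()

wordlist = ["apple", "letmein", "monkey", "password", "zappos"]

# Table the attacker built ahead of time from plain, unsalted words.
precomputed = {hashlib.sha512(w.encode("utf-8")).hexdigest(): w for w in wordlist}

salt = "kAkbkLsi"
stored = toy_salted_hash(salt, "password")

print(stored in precomputed)  # False: the old table is useless against a salted hash

# The attacker has to redo the dictionary run with this particular salt.
cracked = next((w for w in wordlist if toy_salted_hash(salt, w) == stored), None)
print(cracked)  # password
```

The salt doesn't make a weak password strong. It only forces the work to be repeated for each salt instead of being done once for everybody.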

FLG isn't worried about the hacker getting his password because he uses a password vault program and generates random passwords for each website. So, his password at Zappos was something like 45dSIjnGkR98AQMIYhx9, which the hackers will never get. For example, this website says it would take, and FLG isn't kidding, about 89 quintillion years. And even if they did get it, it's totally useless. He's changed his Zappos password to something just as random and that password has no relation to his password at Amazon or eBay or PayPal. They are all totally random.
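A figure of that size is easy to sanity-check. The sketch below is a back-of-the-envelope estimate, not the website's actual method: it assumes a 62-character alphabet (letters and digits), an attacker who has to try every combination, and 250 million guesses per second, the rate mentioned in the comments below. The names random_password and years_to_exhaust are made up for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters
GUESSES_PER_SECOND = 250_000_000                 # assumed rate for one fast desktop
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

def random_password(length=20):
    """Generate a random password the way a vault program might (sketch only)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def years_to_exhaust(length, alphabet_size=len(ALPHABET)):
    """Worst-case years to try every password of this length at the assumed rate."""
    return alphabet_size ** length / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(random_password())              # different every run
print(f"{years_to_exhaust(20):.1e}")  # roughly 8.9e+19, i.e. about 89 quintillion years
print(f"{years_to_exhaust(8):.1e}")   # an 8-character password, for comparison
```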

On the other hand, if FLG's password were a dictionary word or a dictionary word with a couple of numbers and maybe a symbol tacked on the end, and he used that same password at Zappos and Amazon and his bank's website, then he'd be way more concerned. FLG thinks Zappos should not only force customers to change their passwords, which they did, but also strongly recommend that customers who are using that same password at other websites change those as well.

2 comments:

The Ancient said...

What I don't understand is this ...

If that same site says a given password would take a desktop PC 600 years to compromise, is that a bad thing? (And if it is, what should the goal be?)

FLG said...

It's all about safety margins.

I looked at the code that is used to calculate the time it takes. The actual calculation (your 600 years) appears to be based around 250 million tries per second.

Also, and this is important, the website is factoring in Moore's law, which says computing power doubles roughly every two years.

That 250 million per second is a pretty fast desktop. They could really rig it out with multiple graphics cards and get maybe another 2-4 times that. That's what I'd consider a probable threat. Time to crack would be in the 150 year range. If they threw ten high-end computers at it, which is a reasonable threat, then we are down to 15 years.

If it were in the hands of some Russian hacker with a botnet, a huge network of 10 to 100 thousand compromised computers at their disposal, then maybe it's crackable in a week or so. But then again that's a less likely threat.
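The arithmetic behind those estimates is just division by the speedup. A quick sketch in Python, under simplifying assumptions that are not from the website's code: cracking time scales linearly with the number of machines, each botnet machine is treated as one baseline desktop, and Moore's law is ignored. The scenario labels and the function name scaled_years are made up for illustration.

```python
BASELINE_YEARS = 600  # the site's estimate for one desktop at about 250 million tries per second

# Speedup relative to that single desktop (all assumed round numbers).
scenarios = {
    "one fast desktop": 1,
    "desktop with several graphics cards (4x)": 4,
    "ten of those rigs": 40,
    "botnet of 10,000 desktops": 10_000,
    "botnet of 100,000 desktops": 100_000,
}

def scaled_years(baseline_years, speedup):
    """Time to crack if the work is split evenly across `speedup` times the hardware."""
    return baseline_years / speedup

for name, speedup in scenarios.items():
    years = scaled_years(BASELINE_YEARS, speedup)
    print(f"{name}: {years:.3f} years ({years * 365.25:.1f} days)")
```

Under these assumptions the botnet lands somewhere between a couple of days and a few weeks, which is the same ballpark as "a week or so".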

There are a couple of passwords that I actually remember. The ones that I don't really care about too much are around 15 years. The ones I do care about are in the tens of thousands of years. The password to my password vault is 1 septillion years.

 
This work is licensed under a Creative Commons Attribution-No Derivative Works 3.0 United States License.