Password Storage Lessons from the Trenches

Starfleet Academy recommended you but you didn't hash my password!?

I love to read about security from a software architecture perspective because one of my hobbies is to test different theories that attempt to improve upon current API designs. My ultimate goal is to build a framework that is extremely lightweight on memory and lines of code, while empowering developers as much as possible to handle even the toughest of business rules. I spent a good amount of time researching password authentication specifically: testing encryption algorithms, comparing password hashing methods, and studying some of the largest security issues plaguing today’s websites. Without going into specifics, here is a high-level summary of how you might improve your process of handling passwords.

Step 1: Encourage your users to create better passwords

In a previous blog post, I listed 10 reasons why users have bad password practices, which include rampant password reuse, short password lengths and little complexity. I also suggested that you at least require users to have passwords 10-16 characters long, and to provide a password strength meter and a list of hints as a way to encourage them to create stronger passwords.
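
As a rough illustration of the hints idea, here is a minimal Python sketch. The 10-character minimum comes from the range suggested above; the individual rules and the function name password_hints are my own assumptions for illustration, not a vetted strength meter.

```python
MIN_LENGTH = 10  # lower bound of the 10-16 character range suggested above


def password_hints(password):
    """Return a list of hints; an empty list means there is nothing to suggest."""
    hints = []
    if len(password) < MIN_LENGTH:
        hints.append(f"Use at least {MIN_LENGTH} characters.")
    # A string that equals its own lower- or upper-cased form has no mixed case.
    if password.lower() == password or password.upper() == password:
        hints.append("Mix upper- and lower-case letters.")
    if not any(ch.isdigit() for ch in password):
        hints.append("Add at least one digit.")
    if password.isalnum():
        hints.append("Add a symbol or a space.")
    return hints
```

Showing the returned hints next to the password field as the user types gives them a reason to keep improving the password rather than stopping at the first one that is accepted.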

Yahoo registration example

Step 2: Assume that your users have compromised email accounts

Email-based account recovery is generally considered insecure. Most people reuse their passwords between services, so all it takes is for one of those services to get hacked for an attacker to be able to access their email account. If you have to support an account recovery feature, ask users for additional data points at the time of registration so that when they forget their passwords you can confirm who they are by 1) what they own and 2) what they know, a form of multi-factor authentication. For example, ask them for their mobile phone number during registration so that if they forget their password down the road you can send them a text message with a code that allows them to get to the next step of the password reset process. From there, ask them for something they should know, such as their account number, date of birth, or the address you have on file for them, just in case someone happened to steal their phone. Only if everything checks out do you allow a user to change their password.
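
The two checks described above could be sketched roughly as follows in Python. The six-digit code, the ten-minute expiry, and the function names are assumptions of mine; actually sending the text message is left out because it depends on your SMS provider.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 600  # assumption: a code stays valid for ten minutes


def issue_reset_code(now=None):
    """Create a 6-digit one-time code and the time at which it expires."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + CODE_TTL_SECONDS


def may_reset_password(sent_code, expires_at, typed_code,
                       fact_on_file, typed_fact, now=None):
    """Allow a reset only if both factors match before the code expires."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    owns_phone = hmac.compare_digest(sent_code.encode("utf-8"),
                                     typed_code.encode("utf-8"))
    knows_fact = hmac.compare_digest(fact_on_file.encode("utf-8"),
                                     typed_fact.encode("utf-8"))
    return owns_phone and knows_fact
```

In a real system you would also want to limit how many times a code can be guessed, for the same reasons given in step 7.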

Step 3: Transmit data over a secure connection

You should never transmit sensitive information over unsecured connections. Public Wi-Fi connections are among the most vulnerable: someone may unwittingly register for your service from a laptop inside a coffee shop while another person records all of the information being broadcast. It’s a good way to let users get their passwords stolen before they even get to your server. At a minimum, you should redirect any attempts to access your registration or login pages from HTTP to HTTPS. The connection should run over TLS using the latest version of the protocol that your servers and clients support; the older SSL protocols (SSL 2.0 and 3.0) have been deprecated and should be disabled. I’m not going to go into the details here, but you’ll need a certificate to do this; these are typically purchased, though free options do exist. It may be worthwhile to do this anyway: as an added benefit, Google has said it treats HTTPS as a ranking signal, so it may help your position in its search results.
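
In practice the redirect is usually configured in the web server or framework, but the rule itself is small enough to sketch in Python. The function name https_redirect_target is hypothetical, and the sketch assumes default ports, so a URL with an explicit port would need extra handling.

```python
from urllib.parse import urlsplit, urlunsplit


def https_redirect_target(url):
    """Return the HTTPS form of an http:// URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Keep host, path, query string, and fragment; change only the scheme.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))
```

A request handler would answer with a redirect to the returned URL whenever it is not None, and serve the registration or login page only over HTTPS.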

Step 4: Generate salts for your hashes (User and table levels)

Instead of storing users’ passwords in your database, or worrying about how to encrypt them, you should hash them and store the hashes. Then, when a user types their password to log in at a later time, you hash it the same way and compare the result with the one you stored earlier. If they match, the user typed the same password. Keeping the original passwords in your data store is considered such a security no-no that people have started to publicly shame websites that seem to hang on to their users’ plaintext passwords instead of hashing them.

Currently, the industry standard advice is to at least hash each password with a cryptographically secure random salt that you generate for each new password credential, as well as a system-wide salt (often called a pepper) that you keep outside of your database in case only your database is compromised, but not your code. This offers reasonable protection against rainbow tables, where a hacker compares each hash against a pregenerated list of hashes. It also means that two people who share the exact same password no longer end up with the same hash, so cracking one hash does not immediately reveal every other account that uses that password.
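
Here is a minimal Python sketch of generating the two salts. The 16-byte length, the environment variable name APP_SYSTEM_SALT, and the hex encoding are assumptions chosen for illustration; where you actually keep the system-wide salt depends on your deployment.

```python
import os
import secrets

SALT_BYTES = 16  # assumption: 128 bits of salt per credential


def new_user_salt():
    """Per-credential salt from the OS random source; stored next to the hash."""
    return secrets.token_bytes(SALT_BYTES)


def load_system_salt(env=os.environ):
    """System-wide salt kept outside the database (here, a hex-encoded env var)."""
    value = env.get("APP_SYSTEM_SALT")  # hypothetical variable name
    if not value:
        raise RuntimeError("APP_SYSTEM_SALT is not set")
    return bytes.fromhex(value)
```

The per-user salt does not need to be secret, only unique and unpredictable. The system-wide salt is the part that must stay out of the database.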

Step 5: Run your salts and passwords through a password hashing algorithm

More precisely, you should use a password hashing algorithm to create your hashes. These algorithms are designed to overcome one of the problems of general-purpose hashing methods: they’re too fast. Hackers can now calculate billions of hashes per second using common computer components in order to “recover” the original passwords. A password hashing algorithm makes each guess deliberately expensive, so discovering the passwords takes many times longer than it would against a fast hash.

Generally accepted password hashing algorithms include scrypt, BCrypt and PBKDF2. I originally chose PBKDF2 for its endorsement by the National Institute of Standards and Technology (NIST). It works by rehashing generated hashes of the original string thousands of times, intentionally slowing down the computation by a matter of fractions of a second. The OWASP Foundation, a security research group, recommended using SHA-256 for the hashing algorithm with as many iterations as realistically possible, citing an example that Apple uses PBKDF2 with 10,000 iterations for iTunes passwords. This adds challenges to hackers by forcing them to know how many iterations you hashed the passwords at as well as both salts you used from the previous step before being able to recover all of the passwords in the table. Note that to cope with ever-increasing hardware capabilities, you should be adding more iterations each year to keep up.
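
A minimal sketch of PBKDF2 with SHA-256, using Python’s standard library. The iteration count is the 10,000 figure cited above and should be treated as a floor that you raise over time. Concatenating the per-user and system-wide salts is one simple way to combine them and is my assumption, not a prescribed method.

```python
import hashlib
import hmac

ITERATIONS = 10_000  # figure cited above; raise it as hardware gets faster


def hash_password(password, user_salt, system_salt, iterations=ITERATIONS):
    """Derive a 32-byte hash from the password and both salts."""
    salt = user_salt + system_salt  # assumption: combine salts by concatenation
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def verify_password(password, user_salt, system_salt, stored_hash,
                    iterations=ITERATIONS):
    """Re-derive the hash and compare it in constant time."""
    candidate = hash_password(password, user_salt, system_salt, iterations)
    return hmac.compare_digest(candidate, stored_hash)
```

Storing the iteration count alongside each hash lets you raise it later and still verify older credentials until they are rehashed.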

Note: I’m starting to lean more towards BCrypt these days simply for the noticeable lack of hardware miners that use the BCrypt algorithm. SHA-based and scrypt-based miners are getting cheaper and more available due to the popularity of cryptocurrencies that are hashed with those algorithms. Devices that improve BCrypt cracking speeds are still in the research phases. I foresee that in the future it won’t be about weaknesses in password hashing algorithms so much as the speed at which cheap hardware can crack them through billions or trillions of hash attempts per second.

Step 6: Encrypt sensitive data

We already have laws that require sensitive data to be encrypted, such as tax information, credit card numbers, and healthcare records. After several businesses in the healthcare industry got hacked, the Department of Health & Human Services (HHS) started providing a searchable wall of shame to increase awareness.

Let’s take this further and suppose hackers figured out someone’s recycled username and password from one of the many publicly leaked databases of user accounts. While they may have enough information to log in to your website as that user, it’s not enough to constitute a system-wide account breach. However, what would happen if that same hacker got hold of your entire users table through a simple SQL injection attack? If the usernames are not encrypted, the hacker could cross-reference them with the other lists and end up with a list of passwords the users likely used. With that, they’re already well on the way to brute-force attacking the rest of the string, your two salts. By encrypting the usernames, email addresses, and their associated salts, you force them to guess the whole thing from scratch, effectively decreasing the chance someone can reverse engineer multiple passwords beyond what they could already glean from other hacked databases.

Step 7: Limit failed login attempts

So now you have effectively increased the time it takes to verify a password by fractions of a second to a second or more, depending on your system and business needs. What happens when a malicious attacker tries to run a dictionary attack against your own login page by automating hundreds of login attempts per second through your Web interface? It’s possible that the extra load would cripple your server, amounting to a kind of denial-of-service attack. Instead, have your server log the number of failed attempts per user account, and failed attempts per IP address, for a given time period. If there is a large number, such as 10-20 failures per minute, you should lock the account until the user can pass the secondary validation for lost passwords I mentioned in step 2, or at least throttle the attempts depending on the number of failed logins in a row. That way it dramatically slows down hackers’ login attempts, freeing up your server for your genuine customers. I would also like to mention here that CAPTCHA validations raise accessibility concerns because they can lock out disabled people, and offering them without an accessible alternative may run afoul of accessibility laws in some jurisdictions. They are also relatively simple to solve through automated means.
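
The per-account and per-IP counting could be sketched in Python as a sliding window like the one below. The class name FailedLoginTracker and its defaults are assumptions (10 failures per 60 seconds is the low end of the range above). It keeps state in memory for a single process, so a real deployment would need shared storage.

```python
import time
from collections import defaultdict, deque

MAX_FAILURES = 10    # low end of the 10-20 failures per minute range above
WINDOW_SECONDS = 60


class FailedLoginTracker:
    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # key -> timestamps of recent failures

    def _recent(self, key, now):
        """Drop timestamps older than the window and return what remains."""
        stamps = self._failures[key]
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        return stamps

    def record_failure(self, account, ip, now=None):
        now = time.monotonic() if now is None else now
        for key in (("account", account), ("ip", ip)):
            self._recent(key, now).append(now)

    def is_blocked(self, account, ip, now=None):
        """True if either the account or the IP has hit the failure limit."""
        now = time.monotonic() if now is None else now
        return any(
            len(self._recent(key, now)) >= self.max_failures
            for key in (("account", account), ("ip", ip))
        )
```

Checking is_blocked before running the slow password hash is what protects the server: a blocked request is rejected without paying the hashing cost.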

Step 8: Secure your website

Most of the work so far is focused around protecting your users’ credentials in the event that a hacker manages to obtain a large list of your users. As Benjamin Franklin said, “An ounce of prevention is worth a pound of cure.” In that light, spend time buttoning up holes in your security to prevent that from happening in the first place. I recommend starting with the OWASP Top Ten Project, a compilation of common security flaws ranked by urgency based on the prevalence of attacks and how critical a successful attack against that vector can be. For each security risk, it describes an attack and gives a guide to how you can prevent it. And don’t forget to keep your operating systems and software up to date.

Step 9: Keep up to date with security news

Things are constantly changing in the security world. For example, over 17% of all of the secure servers on the Internet were believed to be vulnerable to an attack on a flaw in the OpenSSL software, later known as Heartbleed. It allowed hackers to read portions of your server’s memory, which could expose raw passwords as they were being entered. It has since been patched, but other flaws and data leaks are being exposed all the time. If you don’t already have one, you should get your company to hire a security consultant to audit your website and networks for potential vulnerabilities.

Step 10: Be ready for anything

Security breaches happen all the time, despite developers’ best efforts. You need to have a plan in place for tackling a data breach before it happens so you can minimize the damage. The first thing you should do is add a flag to each user’s account that you can enable if you think the account may have been compromised. Enabling it disables the user’s login credentials and forces them to authenticate through alternative means, after which they should be required to change their password. The new password should be hashed with new salts that you know weren’t exposed in the hack, and you can version the password hashes appropriately if you like. You also need to be able to reach out to your users within the first 24 hours, and ideally within the same business day.
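
One way to model the flag and the hash versioning, sketched in Python. The Account fields, the version number, and the function names are all hypothetical; in practice they would be columns on your users table and part of your login flow.

```python
from dataclasses import dataclass

CURRENT_HASH_VERSION = 2  # hypothetical: bump whenever salts or parameters change


@dataclass
class Account:
    username: str
    password_hash: bytes
    hash_version: int = CURRENT_HASH_VERSION
    compromised: bool = False


def flag_compromised(accounts):
    """Disable password logins for every account that may have been exposed."""
    for account in accounts:
        account.compromised = True


def can_log_in_with_password(account):
    return not account.compromised


def complete_reset(account, new_hash):
    """Call only after the user has authenticated through alternative means."""
    account.password_hash = new_hash  # derived with salts not exposed in the breach
    account.hash_version = CURRENT_HASH_VERSION
    account.compromised = False
```

Because the flag is checked before the password is, a stolen password is useless against a flagged account until its owner has gone through the recovery checks from step 2.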

In conclusion, password handling and storage are technically complicated, and getting them wrong could mean serious problems for your PR department should your website get hacked. This was only a high-level list, and as such I do not consider it comprehensive. As time permits, I’d love to break these steps down in more detail in future articles, as well as provide actual source code that you can incorporate into your projects.