Hashing for fun - improving password security
No, not this…
But this…
Stay with us non-techies! This post is for everyone…
What is Hashing?
Hashing algorithms are mathematical processes that take an input, jumble it all around, and produce an output. That output is called a hash. These algorithms have a number of features:
• The output is unpredictable - A change of 1 binary digit can produce a completely different output, and you can’t guess what that is – you have to go through all the steps to get there.
• They are non-reversible – You can’t take an output, and figure out what the input was.
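For the technically curious, here is a minimal sketch of the first property using Python's standard hashlib module. The two example inputs are made up for illustration; they differ by exactly one binary digit (the characters "0" and "1" are 0x30 and 0x31).

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA1 hash of text as a 40-character hexadecimal string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical inputs that differ by a single bit in the last character
a = sha1_hex("password0")
b = sha1_hex("password1")

# Count how many hex characters differ between the two hashes
differing = sum(1 for x, y in zip(a, b) if x != y)

print(a)
print(b)
print(f"{differing} of {len(a)} hex characters differ")
```

Running it should show two outputs that look unrelated, even though the inputs are almost identical.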
A common use of hashes is passwords. We don’t want to store user passwords in the clear because if some wicked soul with access to where we’re storing them wanted to, they could steal and sell all those passwords.
If we store the hash, though, then we don’t know their password. Then, when the user tries to log in again, we take the password they typed, hash it, and compare it against the hash we’ve stored. If the two match, then that password – whatever it was – must be correct.
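The store-then-compare flow described above can be sketched roughly as follows. This is a simplified illustration, not how any particular product does it: the in-memory dictionary and the function names (hash_password, register, login) are hypothetical, and the sketch leaves out extras such as per-user salts that real systems add.

```python
import hashlib
import hmac

# Hypothetical user store: username -> stored hash (never the password itself)
stored_hashes = {}


def hash_password(password: str) -> str:
    """Hash a password with SHA512 and return it as a hex string."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the hash of the password."""
    stored_hashes[username] = hash_password(password)


def login(username: str, password: str) -> bool:
    """Hash the attempted password and compare it with the stored hash."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # compare_digest compares in constant time to avoid leaking timing information
    return hmac.compare_digest(stored, hash_password(password))
```

A quick usage example: after register("alice", "correct horse"), calling login("alice", "correct horse") returns True, while login("alice", "wrong guess") returns False.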
However, not all hashing algorithms are created equal, and even good ones can weaken over time. For example, an older algorithm called “MD5” was once regarded as secure, but was ‘broken’ – that is, shortcuts were found that let an attacker produce two different inputs with the same hash far faster than by trial and error, thus defeating its security.
Recently, the same has happened to the more modern hashing algorithm called “SHA1” (which is used by a lot of websites). A team from CWI Amsterdam and Google found a way to skip most of the work, so they needed far fewer attempts than brute force to produce two different inputs with a “valid” (i.e. matching) hash.
SHA1 is what Sitecore uses...
However, it’s not a critical issue. Although ‘broken’ and now regarded as ‘weak’, SHA1 still takes a massive amount of computation to break (think major corporations and nation-states).
“This attack required over 9,223,372,036,854,775,808 SHA1 computations. Yes, that’s Nine quintillion, two hundred twenty-three quadrillion, three hundred seventy-two trillion, thirty-six billion, eight hundred fifty-four million, seven hundred seventy-five thousand, eight hundred & eight computations. This took the equivalent processing power as 6,500 years of single-CPU computations and 110 years of single-GPU computations”
…so if you happen to have 6,500 computers in your house, you might be able to crack the SHA1 algorithm in 365 days.
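As a quick sanity check on those numbers, here is the arithmetic spelled out. The big number in the quote is exactly 2 to the power 63. The sketch takes only the CPU figure, as the post does, and assumes each household computer counts as one "single CPU"; the variable names are just for illustration.

```python
# The number of SHA1 computations quoted above is exactly 2**63
computations = 2 ** 63
print(f"{computations:,}")

# Spread 6,500 single-CPU years of work evenly across 6,500 computers
cpu_years = 6500
computers = 6500
years_needed = cpu_years / computers
days_needed = years_needed * 365

print(f"{years_needed:.0f} year(s), or about {days_needed:.0f} days")
```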
Time to move to something better…(getting more technical now)…
Despite the fact that SHA1 is still difficult to break, it’s time to move to something better. Sitecore has offered the following:
When you create a new website, you must change the weak default hash algorithm (SHA1) that is used to encrypt user passwords to a stronger algorithm.
To change the hash algorithm:
• Open the web.config file and in the <membership> node, set the hashAlgorithmType setting to the appropriate value. We recommend SHA512.
OK, that seems sound… except what about my systems that already have users in them? Like, say, any project that isn’t a brand-new Sitecore instance? Existing users’ passwords will have been stored with the old hashing algorithm, so once the setting changes, their logins will fail...
Well, it turns out that there is a nice example of how to do fall-back in the membership provider: https://gist.github.com/kamsar/6407742
It tries to log someone in using the configured algorithm, and if it fails it tries again with SHA1. Thus, new users and users who change their password can benefit from more secure hashes, but everyone can still log in.
I’ve tested this on a local instance; it works nicely.
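The fall-back idea itself is simple enough to sketch in a few lines. To be clear, this is not the code from the gist and not a Sitecore or ASP.NET API; it is a rough Python illustration of the logic only, and the function names (hash_with, validate) and the example password are made up.

```python
import hashlib
import hmac


def hash_with(algorithm: str, password: str) -> str:
    """Hash a password with the named algorithm and return a hex string."""
    return hashlib.new(algorithm, password.encode("utf-8")).hexdigest()


def validate(password: str, stored_hash: str,
             configured: str = "sha512", legacy: str = "sha1") -> bool:
    """Try the configured algorithm first, then fall back to the legacy one."""
    if hmac.compare_digest(stored_hash, hash_with(configured, password)):
        return True
    # Fall back: the stored hash may pre-date the switch to the new algorithm
    return hmac.compare_digest(stored_hash, hash_with(legacy, password))


# An existing user whose hash was stored with SHA1 can still log in...
old_hash = hash_with("sha1", "hunter2")
print(validate("hunter2", old_hash))

# ...and so can a new user whose hash was stored with SHA512
new_hash = hash_with("sha512", "hunter2")
print(validate("hunter2", new_hash))
```

Both calls should print True, while a wrong password fails against either stored hash.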
KEY TAKEAWAYS:
• All new systems should use SHA512. It’s easy to do this at the start of a project.
• Older systems may want to add a fall-back membership provider, and update the algorithm they’re using to SHA512.