Passwords and Entropy
Regardless of whether we are users or programmers, we’re all familiar with the standard annoyance of registering on a new website. You know the drill: we need a minimum password length, and of course capital letters, digits, and special symbols. In fact, these days it seems that almost every site requires special symbols in passwords (things like % or # or !) and won’t allow us to register without at least one of them. Normally these scrupulous password guardians make us include at least one digit, one special character, and one capital letter. And many programmers follow this rule blindly when building their applications. Of course, most use some prebuilt password mechanism in their sites, so it happens automatically.
But the question is, why? You don’t have to be a security expert to figure out that the idea is to decrease the chances of your password being hacked. But let’s leave behind our intuition for a second and take a quick trip into the wonderful world of information security. Assume I have a password composed of one single letter, the letter a or u for example. What is the maximum number of guesses needed in order to guess the password?
The answer is of course the number of letters in the alphabet, which stands at 26. If I chose a password and a hacker wants to crack it, he just needs to guess the letters one after the other. He’ll start with: “Is the password a?” If yes, he was successful in one try. If I chose the letter z, and the hacker goes through the alphabet in order from a to z, then the number of attempts will be 26. In other words, the difficulty level of the password is 26.
In the world of programming, we often express this password difficulty level using the term entropy. We take the maximum number of attempts needed and compute its logarithm in base 2 (don’t be scared of the math), which here means the logarithm of 26 in base 2. It sounds complicated, but it’s actually easy to calculate with just Google, no calculator needed. Just search Google for:
log2(26)
and you’ll get a result of 4.7. This is the entropy of a password that is one lowercase letter long or, to be more exact, its entropy in bits. It may sound complicated but it’s really not: all you need to remember is log2(x), where x is the maximum number of attempts needed to crack the password.
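For readers who would rather not rely on a search engine, here is a minimal Python sketch of the same calculation. The function name entropy_bits is made up for illustration, and the formula assumes every character is picked uniformly at random from the alphabet.

```python
import math

def entropy_bits(alphabet_size: int, length: int = 1) -> float:
    """Entropy in bits of a password whose characters are each picked
    uniformly at random from `alphabet_size` possible symbols."""
    return length * math.log2(alphabet_size)

# One lowercase letter: 26 possible guesses.
print(round(entropy_bits(26), 1))  # 4.7
```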
Still not convinced it’s that easy? Let’s look at another example. If our one-character password can also be a capital letter, what do you think happens? We can assume the level of difficulty went up. But by how much? Now instead of 26 attempts, we need 52 (26 lowercase, and 26 capital letters). So how much will the entropy be now?
log2(52)
This gives us an entropy result of 5.7. This means the password has a difficulty of 5.7 bits of entropy. Easy, right? And it’s still easy if we need to calculate the entropy of passwords that are more than one character long. If our password has 6 characters, each of the 6 positions can hold any of the 52 letters, so we simply multiply the per-character entropy by 6. The entropy would now be:
log2(52) X 6 = 34.2
34.2 is a nice enough number, but we should extract from it the maximum number of attempts needed to crack the password. All we have to do is take 2 to the power of the result. So, 2^34.2 roughly equals 19,770,609,664 (which is exactly 52^6): that’s 19 billion, 770 million, 609 thousand maximum attempts.
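The round trip between entropy and the number of attempts can be sketched in a few lines of Python. This assumes the 52-letter alphabet and 6-character length from the example; the constant and variable names are only illustrative.

```python
import math

ALPHABET = 52  # 26 lowercase + 26 capital letters
LENGTH = 6

combinations = ALPHABET ** LENGTH     # exact integer count of passwords
bits = LENGTH * math.log2(ALPHABET)   # the same quantity expressed in bits

print(combinations)     # 19770609664
print(round(bits, 1))   # 34.2

# Raising 2 to the entropy recovers the count, up to floating-point error.
print(math.isclose(2 ** bits, combinations, rel_tol=1e-9))  # True
```

Using the rounded value 34.2 instead of the full float gives a slightly smaller number, which is why it is safer to compute the count directly as 52 to the power of 6.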
Wow! That sounds like a lot, right? Truth is, in some cases it would take only a few hours to crack this password. Not all attacks occur over the net. Assuming I have a password-protected file that I want to crack, and it’s located on my computer (or a computer that I can control from my computer), and it’s a powerful computer, I can attempt to crack the password a million times a second. This means that within about five and a half hours (roughly 19,770 seconds), I can theoretically crack it.
This is the main reason sites bother us with special characters. Let’s reexamine the issue. If we make the user type at least one capital letter, one lowercase letter, one digit, and one special symbol, the number of possible characters is: 26 lowercase letters, 26 capital letters, 10 digits, and 10 symbols. All in all, 72 characters. The entropy per character is 6.17 bits, so a password that is 6 characters long has an entropy of 37.02 bits. The difference from a password with only letters doesn’t seem so dramatic, but the number of attempts needed now is 2 to the power of 37.02, or about 139,314,069,504. That’s 139 billion, which is a lot more. For a computer that can make a million attempts a second, we’re talking about almost 39 hours. Not bad!
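To compare the two cases side by side, here is a small Python sketch that turns the number of possible passwords into a worst-case cracking time. The helper name worst_case_hours is hypothetical, and the rate of one million guesses per second is the offline rate assumed in the text, not a measured figure.

```python
def worst_case_hours(alphabet_size: int, length: int,
                     guesses_per_second: float) -> float:
    """Hours needed to try every possible password at the given rate."""
    return alphabet_size ** length / guesses_per_second / 3600

RATE = 1_000_000  # guesses per second (assumption taken from the text)

letters_only = worst_case_hours(52, 6, RATE)  # upper and lowercase letters
with_symbols = worst_case_hours(72, 6, RATE)  # plus 10 digits and 10 symbols

print(f"{letters_only:.1f} h")  # 5.5 h
print(f"{with_symbols:.1f} h")  # 38.7 h
```

These are worst-case figures. On average an attacker would find the password after trying about half of the possibilities.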
So yes, adding special symbols increased the security. But it’s still problematic.
What’s the problem? There are a few. The first is that despite the fact that the entropy is larger, adding symbols doesn’t make the password uncrackable. Sure, a difference of 33 hours is big, but it doesn’t make it impossible.
The second problem is that 37 bits of entropy may be great, but it’s not good enough in the age of powerful graphics processors. In the past a million attempts a second sounded like a lot (and it is), but today we’re talking about a billion attempts a second. At that rate, the 139 billion attempts above take a little over two minutes. In light of this, adding symbols to passwords seems ridiculous. We need to increase the entropy of our passwords by several orders of magnitude, and a few symbols are not gonna do it.
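One way to see why symbols cannot keep up with faster hardware is to express both effects in bits. The Python sketch below assumes the figures used in the text (6 characters, 52 versus 72 possible characters, and a jump from a million to a billion guesses per second); the variable names are only illustrative.

```python
import math

# Bits gained by widening the alphabet from 52 to 72 characters (6 chars).
bits_gained_by_symbols = 6 * (math.log2(72) - math.log2(52))

# Bits "cancelled" when the attacker becomes 1000 times faster.
bits_lost_to_faster_hardware = math.log2(1_000_000_000 / 1_000_000)

print(round(bits_gained_by_symbols, 1))        # 2.8
print(round(bits_lost_to_faster_hardware, 1))  # 10.0

# Worst-case time for the 72-character, 6-long password at a billion
# guesses per second, in seconds.
seconds = 72 ** 6 / 1_000_000_000
print(round(seconds))  # 139
```

In other words, the symbols buy fewer than 3 bits, while the faster hardware takes away about 10.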
The third problem is that special symbols drive users crazy, and they then likely turn to the simplest solution: writing the password on paper (best-case scenario) or in some draft in their email (a worse option). We’re simply not made to remember symbols; it just doesn’t happen.
So what can we do? How about we just let users choose what they want: accept passwords made of nothing but regular letters (upper or lowercase, with spaces if they like) but set a minimum length for the password. For instance, 20 characters minimum.
The entropy level of a longer password will be much higher. In our case:
log2(26) X 20 = 94
This assumes the user chose exactly 20 lowercase letters, the minimum allowed. Our hacker will then need an unimaginable number of attempts: 26^20, or roughly 1.9928149e+28.
At a rate of one million attempts per second, that works out to hundreds of trillions of years. Even at a billion attempts per second on the most powerful processors, we’re still talking about hundreds of billions of years, which makes it an impossible task.
Sounds totally crazy? One of my old passwords was “pedo mellon a minno speak friend and enter”
It’s a password that’s really easy to remember. It’s “speak friend and enter” in Sindarin (that’s an Elvish language) and in English. There’s no way you’ll forget it, especially if you’re a Tolkien fan. And the entropy for this password is about 197 bits. The number of characters in the password (including spaces) is 42, giving us:
log2(26) X 42 = 197.4
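The same figure can be checked with a short Python sketch. It follows the article’s model, which treats every character as an independent random pick from 26 letters; the second calculation, which counts the space as a 27th symbol, is an added variation and not part of the original estimate.

```python
import math

passphrase = "pedo mellon a minno speak friend and enter"

length = len(passphrase)  # letters and spaces
print(length)                             # 42
print(round(length * math.log2(26), 1))   # 197.4

# Counting the space as a 27th symbol raises the figure slightly.
print(round(length * math.log2(27), 1))   # 199.7
```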
With this theoretical basis it’s easy to come to the conclusion that if we want heightened security, the last thing we should be doing is bothering our users with capital letters, lowercase letters, digits, and special characters. All we need to do is to set a high enough minimum number of characters to meet our entropy standards. That’s all.
So maybe the time has come that we stop bothering our users. If our systems need to meet security standards, we need only turn to science—we can set the level of entropy needed and achieve it with length.
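As a rough sketch of what a length-based policy could look like in Python: pick a target entropy, derive the minimum length from it, and check only the length. The function names min_length and meets_policy are hypothetical, and the 94-bit default is simply the figure from the 20-character example, not a recommendation from any standard.

```python
import math

def min_length(target_bits: float, alphabet_size: int = 26) -> int:
    """Smallest password length whose entropy reaches `target_bits`,
    assuming characters are drawn from `alphabet_size` symbols."""
    return math.ceil(target_bits / math.log2(alphabet_size))

def meets_policy(password: str, target_bits: float = 94,
                 alphabet_size: int = 26) -> bool:
    """Length-only check: no rules about digits, capitals, or symbols."""
    return len(password) >= min_length(target_bits, alphabet_size)

print(min_length(94))    # 20
print(min_length(128))   # 28
print(meets_policy("pedo mellon a minno speak friend and enter"))  # True
```

Basing the check on the smallest alphabet a user might use (26 lowercase letters) keeps the estimate conservative: anyone who mixes in capitals or digits only ends up above the target.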
I’ll let xkcd sum things up as only it can:
About the author: Ran Bar-Zik is an experienced web developer whose personal blog, Internet Israel, features articles and guides on Node.js, MongoDB, Git, SASS, jQuery, HTML 5, MySQL, and more. Translation of the original article by Aaron Raizen.