Secure Password Collisions


Eric Hansen

“A secure password is a happy password.”

As with most things in the computer world, there are a thousand different opinions on what constitutes a “secure password”. Some say mixed case, others say random keystrokes; I used to think alphanumeric was just fine, since I could type about 100 words a minute.

What I’ve found mostly works, though, is to use actual words separated by something (spaces are always fun). If you want to reuse the password, you can change the separator characters, so it’s still a different password but one you know.

While MD5 isn’t the safest hash to check against, let’s use it for a small demonstration. Say we have the phrase “bob loves mary”. If we run the phrase through md5sum, we get this hash: 5bda40cc07ec7fc7ef50fa289fadc42b.
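For anyone who would rather stay in Python, here is a minimal sketch of the same hashing step using the standard hashlib module. The helper name md5_hex and its trailing_newline flag are made up for illustration. The flag exists on the assumption that the phrase was piped in with echo, which appends a newline that becomes part of the hashed input.

```python
import hashlib


def md5_hex(text, trailing_newline=True):
    """Return the MD5 hex digest of text.

    trailing_newline=True mimics `echo "text" | md5sum`, where echo
    appends a newline that is hashed along with the phrase (assumption:
    that is how the hashes in this post were produced).
    """
    data = text + ("\n" if trailing_newline else "")
    return hashlib.md5(data.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    phrase = "bob loves mary"
    print("with newline:   ", md5_hex(phrase))
    print("without newline:", md5_hex(phrase, trailing_newline=False))
```

The two lines it prints differ completely, which is worth keeping in mind when comparing a hash made with echo against one made some other way.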

Now we change the spacing characters: “bob#*(1loves98)#mary”, which gives us the hash 33b22b26cde32da82900267a522f8198. I wrote a small Python script that walks through each position and reports where the two hashes match. Here are the results:

$ ./ 33b22b26cde32da82900267a522f8198 5bda40cc07ec7fc7ef50fa289fadc42b
> Found match at position 10: e
> Found match at position 19: 0
Found 2 / 32 characters in same position (0.00%)
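This shows there is next to no chance of telling that the passwords are similar by looking at the hashes themselves. (The 0.00% is an artifact of integer division under Python 2 when this was run; 2 of 32 is really 6.25%. Since each position holds one of 16 hex digits, about 2 matches out of 32 is what you would expect between any two unrelated hashes.) For one last run-through we’ll make a slight change to the original phrase and compare it against the same hash. Now the phrase is “bob  loves  mary” (two spaces instead of one). A hash of 49d0dc6c8284b3704e777b5854996476 gives us this interesting result: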

$ ./ 33b22b26cde32da82900267a522f8198 `echo "bob  loves  mary" | md5sum | awk '{print $1}'`
> Found match at position 24: 5
Found 1 / 32 characters in same position (0.00%)
While this in no way proves that the passphrases are truly secure, it goes to show how even a simple change in how you use your password across systems leaves the two hashes with no visible resemblance to each other.
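To put the one or two matches in context, here is a rough sketch that measures how many positions agree between MD5 hashes of inputs that have nothing to do with each other. The function names and the "left-N" / "right-N" inputs are invented for illustration. With 16 possible hex digits per position, chance alone should give about 32 / 16 = 2 matches per pair.

```python
import hashlib


def matching_positions(h1, h2):
    """Count positions where two equal-length strings hold the same character."""
    return sum(a == b for a, b in zip(h1, h2))


def average_matches(pairs=2000):
    """Average number of matching positions between MD5 digests of unrelated inputs."""
    total = 0
    for i in range(pairs):
        # Hypothetical inputs: any two different strings would do here.
        h1 = hashlib.md5(("left-%d" % i).encode("utf-8")).hexdigest()
        h2 = hashlib.md5(("right-%d" % i).encode("utf-8")).hexdigest()
        total += matching_positions(h1, h2)
    return total / pairs


if __name__ == "__main__":
    print("expected by chance: %.2f" % (32 / 16))
    print("measured average:   %.2f" % average_matches())
```

If the measured average lands near 2, then the 2 and 1 matches seen above are about what unrelated hashes produce anyway.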

For fun I will share the Python script I wrote to do this:

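#!/usr/bin/env python3
"""Compare two hashes and report the positions where their characters match."""

import sys

if len(sys.argv) != 3:
    sys.exit("usage: %s HASH1 HASH2" % sys.argv[0])

h1, h2 = sys.argv[1], sys.argv[2]

# Comparing position by position only makes sense for equal-length hashes.
if len(h1) != len(h2) or not h1:
    sys.exit("hashes must be non-empty and the same length")

count = 0

# Walk both hashes in step and report every position where they agree.
for i, (a, b) in enumerate(zip(h1, h2)):
    if a == b:
        print("> Found match at position %d: %c" % (i, a))
        count += 1

# 100.0 forces float arithmetic so the percentage is not truncated.
print("Found %d / %d characters in same position (%.2f%%)"
      % (count, len(h1), 100.0 * count / len(h1)))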




But this raises another question. I'll ask you because you sound like you know about these things.

We're constantly told that a long password is more secure than a short password. And if the attacker KNOWS the password is short, I agree a brute force attack might be plausible.

But if I pick a 6-character password (including at least one non-alpha to fend off dictionary attacks, and not using anything about myself), how, oh how can the attacker know it's not a 57-character password?

So is a longer, harder-to-remember password really better than a shorter, easier-to-remember password? After all, this hash looks just as scary as yours:
$ echo hash00 | md5sum -
fb9b55bf10862f88d75af8e143ae9e80  -

Depends on a few things.

Most short, easy-to-remember passwords are vastly more likely to be something like "password" or the name of a pet or something. The longer, harder-to-remember password is usually something like "234*@U(*fjkasdfhjl234$#@$".

If you want the best of both worlds, though, put spaces in the password (I've run into a few sites that don't allow it). That way you can pick an easy-to-remember password that also adds some security. It's easier to remember "mary hopkins blurbs" than the 234... password above.
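To put rough numbers on the length question: a brute-force attacker doesn't need to know the length, since the usual approach is to try every shorter length first. The sketch below counts how many candidates that means. The function name and the alphabet of 95 printable ASCII characters are assumptions for illustration, and it only counts raw character combinations; it says nothing about dictionary-based guessing.

```python
def candidates_up_to(max_length, alphabet_size=95):
    """Number of strings of length 1..max_length over the given alphabet.

    alphabet_size=95 assumes the printable ASCII characters, space included.
    """
    return sum(alphabet_size ** k for k in range(1, max_length + 1))


if __name__ == "__main__":
    # 19 is the length of "mary hopkins blurbs".
    for length in (6, 8, 12, 19):
        print("up to %2d characters: %.3e candidates"
              % (length, candidates_up_to(length)))
```

Each extra character multiplies the work by roughly the alphabet size, so a 6-character password gets covered long before the search gets anywhere near 19 characters.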