Have you tried Diceware yet?

atanere

Hi all,

I briefly mentioned Diceware back in another thread about password managers, and I continue to be intrigued by this simple method of generating a very secure passphrase that is also very easy to remember. The concept is quite simple: roll a single die 5 times (or a set of 5 dice once) to generate a 5-digit number. That number is then looked up in a large numbered word list to find a single word. (The list has 7776 entries, which also include some numbers and symbols.) But to be secure, many words need to be used, so the dice rolling is repeated until at least 6 words (usually) have been pulled from the list to create a passphrase. The cryptographic science is over my head, but the author, Arnold G. Reinhold, recommends 6 words for most uses these days, and more words for more security. His trademarked system is well described at the link above and on his FAQ. His web page and Diceware word list are also available in many languages.
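
For readers curious about the numbers: 5 rolls of a 6-sided die give 6^5 = 7776 possible codes, so each word contributes log2(7776), or roughly 12.9 bits, provided every word really is picked uniformly at random. A small sketch of that arithmetic follows; the function name passphrase_bits is made up for illustration and is not part of Diceware or of the attached script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the entropy (in bits) of an N-word passphrase.
# Assumes each word is chosen uniformly at random from a 7776-entry list.
function passphrase_bits {
   awk -v words="$1" 'BEGIN { printf "%.1f\n", words * log(7776) / log(2) }'
}

passphrase_bits 6    # prints 77.5
passphrase_bits 8    # prints 103.4
```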

Part of the strength of the Diceware system is that the rolling of dice is truly random. He therefore discourages the use of computer number generation as the quality of the number generator may be suspect. I understand that, yet I was still motivated to create a simple method for my own use. I am no programmer by anyone's stretch of the imagination, but with many hours of trial and error, I have actually created a fairly simple Bash script that will take user input to generate a 6-word, 8-word, or 10-word passphrase using the Diceware word list. If you are interested, please download the attached zip file and give it a test. The details of its operation are described in comments of the script. I accept that the number generator provided by Bash may have some shortcomings, but the passphrases seem sufficiently random for me. If you have concerns about this, then don't use it.
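
If the quality of Bash's built-in generator is the worry, one possible middle ground is to draw the rolls from the kernel's random source instead. The sketch below is only an illustration, not the attached script: it assumes GNU coreutils shuf (for the -r and --random-source options) and a readable /dev/urandom, and the function name get_word_code_urandom is invented here. Real dice remain the method the Diceware author recommends.

```shell
#!/usr/bin/env bash
# Sketch only: five simulated die rolls taken from /dev/urandom.
# Assumes GNU coreutils shuf and a readable /dev/urandom.
function get_word_code_urandom {
   # -i 1-6 : pick from the integers 1..6
   # -r     : allow repeats, as with real dice
   # -n 5   : output five values, one per line
   shuf -r -n 5 -i 1-6 --random-source=/dev/urandom | tr -d '\n'
}

get_word_code_urandom; echo    # prints a 5-digit code made of the digits 1-6
```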

While my script is very basic, if you'd like to see some professional programming skills performing a similar task, check out Glenn Rempe's web page for a nicer interface, more security, and more options.

Any feedback about my script or Diceware in general is welcome.

Cheers
Stan
 

Attachments

  • dice1.0.zip (28.5 KB)


Neat little script there Stan!

There are a few improvements that could be made here and there.
There are some bits of repetition that could be refactored into loops and a few other things I've spotted.

I hope you don't mind, but I've had a quick play with your script and boiled it down to this:
(dice2.sh in the attached zip)
Code:
#!/usr/bin/env bash

# Generates random passphrases using the Diceware(TM) Word list

# Diceware(TM) is a trademark of Arnold G. Reinhold. See more at http://www.diceware.com.

# WARNING! The Diceware(TM) author strongly discourages the use of electronic number generators such as this script, as the quality of the random number cannot be certain. He is right, of course. Do not use this script if you have any concerns about the security of the passphrase it generates. If you have extremely high security needs, use real dice. For my use, the script seems to be sufficiently random. The creation of this script was just for fun and learning.

# Dice version 1.0 completed 12 March 2018 by Stan Vandiver http://www.linuxgeeks.us. Copyright (C) 2017-2018
# Additional edits by Jason Trunks https://notabug.org/JasKinasis

# Quit with an error message
function die {
   echo >&2 "ERROR:" "$@"
   exit 1
}

# Generate a 5 digit random number based on the roll of a 6-sided dice
function get_word_code {
   digits=()
   for digitcount in {0..4}; do
       digits[$digitcount]=$(( RANDOM % 6 + 1 ))
   done
   echo -n "${digits[@]}" | sed 's/ //g'
}

clear
printf "\nPlease visit diceware.com for more info about passphrase security.\n"
printf "\nHow many words should the passphrase contain? "
read number_of_words
printf "\n"

# Ensure user entered a valid number
number_of_words=$(echo "${number_of_words}" | awk '/^[0-9]+$/')
[[ $number_of_words ]] || die "Invalid number entered..."

printf "Dice\tWord list Match\n" > passphrase.txt

# Generate words
count=0
while [[ $count -lt "${number_of_words}" ]]
do
   \grep "$(get_word_code)" dicelist.txt | sed 's/   //g' >> passphrase.txt
   (( count++ ))
done

# Display results
cat passphrase.txt
printf "\n"
#rm passphrase.txt
# Uncomment the line above if you want the file to automatically delete itself after generating each passphrase.

For the sake of brevity in the post here, I've removed some of the longer comments at the top of the file which describe the operation of the script, but I've kept your original copyright/licensing information in - in case anybody else copies code from this post. Full comments will be in the versions in the .zip I've attached.

So what changes are in the above?
Let's run through it:
First up we have a couple of functions.
The first function "die":
Code:
# Quit with an error message
function die {
   echo >&2 "ERROR:" "$@"
   exit 1
}
This is called in the event that an error condition occurs. All it does is output an error message (passed in by the caller) and exits with the value 1 - indicating that an error has occurred.
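
As a usage illustration, die could also guard against a missing word list before any words are generated. This is only a sketch: the helper name require_wordlist is hypothetical and is not in dice2.sh.

```shell
#!/usr/bin/env bash
# Quit with an error message (same function as in the script)
function die {
   echo >&2 "ERROR:" "$@"
   exit 1
}

# Hypothetical helper: stop early if the word list cannot be read
function require_wordlist {
   [[ -r $1 ]] || die "Cannot read word list: $1"
}

# Usage (would exit with status 1 if dicelist.txt is missing):
# require_wordlist dicelist.txt
```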

The second function replaces get_digit and get_a_word.
Code:
# Generate a 5 digit random number based on the roll of a 6-sided dice
function get_word_code {
   digits=()
   for digitcount in {0..4}; do
       digits[$digitcount]=$(( RANDOM % 6 + 1 ))
   done
   echo -n "${digits[@]}" | sed 's/ //g'
}

In the above, we create an empty array called digits.
We then perform 5 iterations in a for loop - counting from 0 to 4.
- each iteration creates a digit in the array.
The reason we count from zero is because arrays are indexed from 0.

As output from the function, we echo the entire digit-array, passing it through sed to strip out the spaces that get put between each value.
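
For comparison, the same code could be built as a plain string, which avoids both the array and the call to sed. This is just one possible variation, and the name get_word_code_str is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: build the 5-digit code by appending one digit per simulated roll
function get_word_code_str {
   local code="" roll
   for roll in {1..5}; do
      # += appends text here, because code is an ordinary string variable
      code+=$(( RANDOM % 6 + 1 ))
   done
   printf '%s' "$code"
}

get_word_code_str; echo    # prints a 5-digit code made of the digits 1-6
```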

By using the modulus operator on the value from $RANDOM we have made the pseudo-random number generation more efficient. Because we are simulating the roll of a 6 sided die, we use modulus 6 to derive our digit.

The modulus operator divides the random number (from $RANDOM) by 6 and yields the remainder, which will be a value between 0 and 5. Because we want values in the range 1 to 6, we simply add 1.

That way you don't need the tr and awk calls in the get_digit function and there will be no need for the inefficient "until...do...done" loops that were in your get_a_word function, because you are now guaranteed to get a pseudo-random value between 1 and 6 every single time.

This simple change has taken out huge chunks of duplicate code and has removed the need for writing to the digits.txt file - so there are fewer disk operations being performed too.
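
One caveat worth knowing about the modulus approach: $RANDOM yields 32768 possible values (0 to 32767), and 32768 is not a multiple of 6, so remainders 0 and 1 come up very slightly more often than the others. The effect is tiny, but it can be removed by discarding the two leftover values and rolling again. A minimal sketch of that idea, with roll_die as an invented name:

```shell
#!/usr/bin/env bash
# Sketch: simulate one die roll from $RANDOM without modulo bias.
# 32768 = 6 * 5461 + 2, so only 0..32765 splits evenly into six groups.
function roll_die {
   local r
   while :; do
      r=$RANDOM
      # Discard 32766 and 32767 and try again
      if (( r < 32766 )); then
         echo $(( r % 6 + 1 ))
         return
      fi
   done
}

roll_die    # prints a single digit from 1 to 6
```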

The next part is quite straightforward:
Code:
clear
printf "\nPlease visit diceware.com for more info about passphrase security.\n"
printf "\nHow many words should the passphrase contain? "
read number_of_words
printf "\n"

# Ensure user entered a valid number
number_of_words=$(echo "${number_of_words}" | awk '/^[0-9]+$/')
[[ $number_of_words ]] || die "Invalid number entered..."
When I was looking at the original code, I thought - Why only 6, 8 or 10 word passphrases? Wouldn't it be better to allow the user to simply enter the number of words to generate?

So here we ask the user to enter the number of words they wish to generate and then perform some checks on the user-entered value.

The line:
Code:
number_of_words=$(echo "${number_of_words}" | awk '/^[0-9]+$/')
Checks whether the user's input was purely numeric. If the entire line is numeric, the value is unchanged, but if there are any non-numeric characters, $number_of_words will be set to blank/"".
The next line checks that we still have a value - if $number_of_words is blank, we call the die function and pass an error message.
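
The awk check still lets through 0 and values with a leading zero such as 08, which Bash would try to read as an invalid octal number in the later -lt comparison. A possible stricter check, using Bash's own regex matching instead of awk, is sketched below; is_valid_count is a hypothetical name, not something from the script.

```shell
#!/usr/bin/env bash
# Sketch: accept only positive whole numbers without a leading zero
function is_valid_count {
   [[ $1 =~ ^[1-9][0-9]*$ ]]
}

# Usage with the script's existing die function:
# is_valid_count "$number_of_words" || die "Invalid number entered..."
```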

Now we are on the home-straight. The final part of the script generates the passphrase in passphrase.txt and cats it to the screen:
Code:
printf "Dice\tWord list Match\n" > passphrase.txt

# Generate words
count=0
while [[ $count -lt "${number_of_words}" ]]
do
   \grep "$(get_word_code)" dicelist.txt | sed 's/   //g' >> passphrase.txt
   (( count++ ))
done

# Display results
cat passphrase.txt
printf "\n"
#rm passphrase.txt

The first printf command writes a header to the passphrase file "passphrase.txt".
Then we set up a counter "count" and a while loop to generate our required number of words.
The magic which generates each word is contained in a single grep command inside the loop.
There is quite a lot going on in that single line:
Code:
    \grep "$(get_word_code)" dicelist.txt | sed 's/   //g' >> passphrase.txt

Firstly - we are calling \grep (not grep) - this is to escape/avoid using any grep aliases that the user might have set up that could affect the formatting of grep's output. That way we know we aren't going to get line-numbers added to the output from grep.

The first parameter to grep is "$(get_word_code)" which is the captured result from a call to the "get_word_code" function. So before grep is actually called, get_word_code is called to create the random code-number for the word.
The second parameter to grep is the path to the dicelist.txt which is the file containing our wordlist.

The net effect of the first part of the line is to generate a 5 digit code which is used as the search pattern in a grep of the wordlist.

The line matched by grep is passed/piped to sed, to strip the leading spaces before the code-number and the remainder of the line is appended to passphrase.txt via output redirection.
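
Because grep matches the code anywhere on a line, a stricter alternative is to compare the code against the first field only. The sketch below assumes each line of the word list holds the code and the word separated by whitespace (as the example output suggests); lookup_word is an invented name and not part of dice2.sh.

```shell
#!/usr/bin/env bash
# Sketch: print the word whose first field equals the given code.
# Assumes "code<whitespace>word" lines; leading spaces are ignored by awk.
function lookup_word {
   local code=$1 wordlist=$2
   awk -v code="$code" '$1 == code { print $2; exit }' "$wordlist"
}

# Usage:
# lookup_word "$(get_word_code)" dicelist.txt
```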

We then increment our counter and continue to generate random words in this manner until we have the required number of words.

Finally, the content of "passphrase.txt" is displayed on-screen.
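
Since passphrase.txt stays on disk unless the rm line is uncommented, another option is to keep the words in memory and only print them. The following is a rough sketch of that approach under the same word-list assumption (code and word separated by whitespace); make_passphrase is a hypothetical name and this is not how dice2.sh works.

```shell
#!/usr/bin/env bash
# Sketch: print COUNT random words from WORDLIST on one line,
# without writing the passphrase to a file.
function make_passphrase {
   local count=$1 wordlist=$2 code word i roll
   local -a words=()
   for (( i = 0; i < count; i++ )); do
      code=""
      for roll in {1..5}; do
         code+=$(( RANDOM % 6 + 1 ))
      done
      # Match the code against the first field of each line
      word=$(awk -v code="$code" '$1 == code { print $2; exit }' "$wordlist")
      [[ $word ]] || { echo >&2 "ERROR: no word found for code $code"; return 1; }
      words+=("$word")
   done
   echo "${words[*]}"
}

# Usage:
# make_passphrase 6 dicelist.txt
```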

Here is example output from running the modified version of the script:
Code:
Please visit diceware.com for more info about passphrase security.

How many words should the passphrase contain? 8

Dice    Word list Match
32456   head
34614   joust
13311   bandit
56435   tansy
64624   xyz
55136   spa
11516   akin
25444   filch

Note: The Dice-numbers are no longer duplicated in the output.
The previous version output like this:
Code:
Dice    Word list Match
22333   22333   dane
41231   41231   lug
# snip snip

If the original output was intentional and you want it to continue to appear as above, you can make the following changes (as per dice3.sh in the .zip):

1. In get_word_code, pipe the output from sed to tee:
Code:
# Generate a 5 digit random number based on the roll of a 6-sided dice
function get_word_code {
   digits=()
   for digitcount in {0..4}; do
       digits[$digitcount]=$(( RANDOM % 6 + 1 ))
   done
   echo -n "${digits[@]}" | sed 's/ //g' | tee -a passphrase.txt
}
The addition of the pipe to tee -a will append the word-code to passphrase.txt

2. In the \grep command in the main while loop, remove the call to sed:
Code:
# Generate words
count=0
while [[ $count -lt "${number_of_words}" ]]
do
   \grep "$(get_word_code)" dicelist.txt >> passphrase.txt
   (( count++ ))
done

Those 2 changes will yield the original output.
I just thought it looked a bit odd having the number repeated twice. But it's your script, so I'll leave that decision up to you.

It probably took me a lot longer to write this post and explain what I did than it did to actually make the changes to your script.

I had a bit of fun looking at the code and making my "improvements" - if they can be called that. At some point, I'm sure someone else will weigh in with things that could be done better still!

Anyway - I've uploaded a zip containing your original script, the word-list, plus two versions with my changes.

dice2.sh contains my original changes, without the duplication of the dice-numbers

dice3.sh contains my changes, but retains the original output format.

Feel free to use, or ignore whatever you like, after all it's your script.

If there's anything you don't understand, or want further clarification on, feel free to give me a shout!
 

Attachments

  • dice1.01.zip (43.5 KB)
Neat little script there Stan!
Thanks Jason! But... I gotta say... WOW!!! :D:D:D Your edits are awesome! And of course it highlights the difference between a real programmer and, well, me! :D:D:D I thought you might tweak it, and I greatly appreciate it. Your note script was what motivated me to finally debug the errors I was getting, and I knew that the diceware concept could be much better implemented... but I truly lack the knowledge/skill to get much further on my own than what I made.

If there's anything you don't understand, or want further clarification on, feel free to give me a shout!
Thanks! I may take you up on that. I'm looking forward to reviewing it more carefully and understanding better how you accomplished things differently. It's certainly not too late for this old dog to learn some new tricks, but programming has never been a skill that I could claim.

For any readers out there who are interested, please download @JasKinasis' dice1.01.zip instead (and run dice2.sh)... it's great! :D:D:D

Cheers
 
Well, sharing knowledge is the main reason I am here. There haven't been many programming-related problems on here in a while, so your script was a rare chance for me to take a look at someone else's code and suggest some improvements.
It's certainly not too late for this old dog to learn some new tricks, but programming has never been a skill that I could claim.
I agree - it's never too late to learn. And don't run yourself down. Your script did what you wanted it to and was an achievement in itself. It does show that you have some talent with programming. It is evident from the code that you had found and fixed several problems with your original script during its development.

For example you realised that there was a possibility that your random digit generation could yield an invalid, or empty value, so you used the until...do...done loops to ensure that your code kept going until it did generate valid values for the digits. That in itself was no small feat. It showed a lot of lateral thinking.

As the old saying goes - there is always more than one way to skin a cat. The fact that your approach was a little more complicated than mine is irrelevant. Your script still does the job you intended it to do. So it's not that my way is necessarily any more correct than yours. Mine was just a little more direct.

The more scripts you write and the more problems you have to resolve/overcome in the process - the more shell-scripting tricks you will learn and the better your scripts will become.
 
As the old saying goes - there is always more than one way to skin a cat.

And that much I've learned for sure! It seems that rarely in computing, in general, is there only one way to do something... and programming is even more so. It was a fun experience for me though, and satisfying that I got the darn thing to work at all! :D I've just picked up a Raspberry Pi lately, and it seems like a good target to try a few more things as time allows. Retirement is coming up for me in about a year, and I'm looking for a few things to occupy my mind... as long as I still get my nap time in! :D:D

Cheers
 
