There are times when you need to compare two files to check whether they are the same. For instance, if you have two or more servers all running the same Operating System (OS) and a file is causing issues on one of them, you can compare that file with the copy on a working server. If the two differ, the fix may be as simple as copying the file from the working server to the failing server. File comparison works for both text files and binary files.

Text Files

To compare text files we need a text file to work with, so let’s make one. We will create a text file which contains the help information for the ‘adduser’ command. Open a Terminal and type the following commands:

    adduser --help > adduser.1
    adduser --help > adduser.2

Now we have two identical files to start with. Let’s first compare two files which are the same. Enter the following command in a Terminal:

    diff adduser.1 adduser.2

When you press ENTER you should get the prompt back with nothing printed to the screen. No output means there are no differences between the two files.

Now we need to make some differences. Open ‘adduser.2’ in a text editor of your choice. Perform a ‘Search and Replace’ to find ‘--’ and replace the double dashes with ‘---’. Save the file as ‘adduser.2’ and close the text editor. Now run the same command again to see the differences found between the two files.
The beginning of the output is as follows:

    1,3c1,3
    < adduser [--home DIR] [--shell SHELL] [--no-create-home] [--uid ID]
    < [--firstuid ID] [--lastuid ID] [--gecos GECOS] [--ingroup GROUP | --gid ID]
    < [--disabled-password] [--disabled-login] [--encrypt-home] USER
    ---
    > adduser [---home DIR] [---shell SHELL] [---no-create-home] [---uid ID]
    > [---firstuid ID] [---lastuid ID] [---gecos GECOS] [---ingroup GROUP | ---gid ID]
    > [---disabled-password] [---disabled-login] [---encrypt-home] USER

The first line, ‘1,3c1,3’, shows that lines 1 to 3 differ between the files. The numbers before the ‘c’ refer to the first file and the numbers after it to the second file; in each pair the first number is the first line affected (Line 1) and the last number is the last line affected (Line 3). The ‘c’ stands for ‘change’: the lines have been ‘changed’. The second to fourth lines of the output begin with ‘<’, which marks text from the first file, ‘adduser.1’. The fifth line, ‘---’, separates the two versions. Lines six through eight begin with ‘>’ to mark text from the second file, ‘adduser.2’. Lines 4 and 5 of the files themselves were not changed, since the next entry in the full output is for Lines 6 to 8.

Delete ‘adduser.2’ and copy ‘adduser.1’ to ‘adduser.2’ so we can start with a fresh comparison. Once completed, run the following command:

    sed -i '/^$/d' adduser.2

The command removes every blank line from the file and saves the result back to the same file. Now run the command:

    diff adduser.1 adduser.2

The output is returned as follows:

    5d4
    <
    10d8
    <
    14d11
    <
    17d13
    <
    20d15
    <
    29d23
    <

The first entry shows that line 5 of the first file (<) was ‘deleted’: it has no counterpart in the second file, where it would have followed line 4. Further on, Line 10 of the first file is missing after Line 8 of the second file, and so on.

Again you need to replace ‘adduser.2’ with a copy of ‘adduser.1’. Because ‘cp’ overwrites the existing file, this can be done with a single command:

    cp adduser.1 adduser.2

If you want to test the files, run ‘diff adduser.1 adduser.2’ and see that they are the same.

Open ‘adduser.2’ for editing in a text editor. Place the cursor on the empty line above ‘general options:’, press ENTER, then type ‘new line’. Save and close the editor. Now run the ‘diff’ command again as before.
The output is:

    20a21
    > new line

After Line 20 of the first file a line has been ‘added’: it is Line 21 of the second file (>) and contains ‘new line’.

You can also see the files side-by-side using the ‘-y’ parameter as shown:

    diff -y adduser.1 adduser.2

Here you will see the characters ‘<’, ‘>’ and ‘|’ marking the changes between the two files: ‘<’ for a line found only in the first file, ‘>’ for a line found only in the second file and ‘|’ for a line which differs between them.

If you simply want to know whether the files differ, use the ‘-q’ option. The output will either say ‘Files adduser.1 and adduser.2 differ’ or nothing will be returned if they are the same.

Binary Files

Binary files, such as programs made up of machine code, hold data which is meant to be understood by the computer. Where text files can be read and usually understood by people, binary files look like a mish-mash of characters which is not legible.

To get a binary file for the examples, go to ‘/usr/bin’ and find the file ‘diff’. Copy it to your HOME folder as ‘diff1.bin’, then copy that file as ‘diff2.bin’. Check that the copy is exact by using the command:

    cmp diff1.bin diff2.bin

As with ‘diff’, no output means the two files are identical.

To change a binary file you should use a hex editor like ‘okteta’ or ‘hexer’. Once you have one of these installed you can edit your second binary file (diff2.bin) and make a change to one character. After making a change and running another ‘cmp’ command the results are:

    diff1.bin diff2.bin differ: byte 2, line 1

‘cmp’ reports the first byte which differs, so if you change a different byte in the binary file the reported position will differ.

Another option is to produce a checksum for each binary file by using the command ‘md5sum’. The commands would be:

    md5sum diff1.bin
    md5sum diff2.bin

The results are shown in Figure 1. You can see that the checksums are different, which shows that the binary files are not the same.

FIGURE 1

It is possible to shorten the above commands into one by issuing:

    md5sum diff1.bin diff2.bin

The results are shown in Figure 2, where the two checksums are printed one above the other and are easier to compare.
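FIGURE 2

You could also save the checksum of each binary file to a text file and then compare the two text files to see if there is a difference. Because ‘md5sum’ prints the file name after the checksum, feed each file in on standard input so that only the checksum is recorded; otherwise the two text files would always differ by the name alone:

    md5sum < diff1.bin > diff1.txt
    md5sum < diff2.bin > diff2.txt
    diff diff1.txt diff2.txt

Practice with these file comparison programs and understand what they do and how to make them work for you.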
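As one way to practice, the ‘cmp’ check from the Binary Files section can be wrapped in a small script. What follows is only a minimal sketch: the function name ‘compare_files’ and the sample file names are made up for illustration and are not part of any tool described above. It relies on the ‘-s’ option of ‘cmp’, which prints nothing and reports the result through its exit status.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this sketch): report whether
# two files are identical. 'cmp -s' stays silent and exits with 0
# when the files match and non-zero otherwise (including when a
# file cannot be read, which this sketch also reports as different).
compare_files() {
    if cmp -s "$1" "$2"; then
        echo "same"
    else
        echo "different"
    fi
}

# Build throwaway sample files in a temporary directory. These names
# are illustrative only; they are not the files used in the article.
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/a.txt"
cp "$workdir/a.txt" "$workdir/b.txt"
printf 'alpha\ngamma\n' > "$workdir/c.txt"

compare_files "$workdir/a.txt" "$workdir/b.txt"   # exact copy
compare_files "$workdir/a.txt" "$workdir/c.txt"   # second line differs

# Remove the sample files again.
rm -rf "$workdir"
```

Because the helper works on bytes rather than lines, it should behave the same for text files and binary files, so it could be pointed at ‘diff1.bin’ and ‘diff2.bin’ in the same way.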