wc -m behaves in an unpredictable way

S

Sergii

Guest
Hello everyone.
Does anybody know why the two commands below give such unexpected results?
########################
$ echo "12345" | wc -m
6
########################
$ echo "12345" >> numbers.txt
$ cat numbers.txt | wc -m
6
########################
Actually, I believe the result should be 5 instead of 6.
 


OP
R

rstanley

Guest
'1', '2', '3', '4', '5', '\n'

echo appends a newline character to its output, so the newline ends up in the pipe or the file, and wc counts it as a legitimate character in the total. Use 'hexdump -C' to display the data in the file. Use your package manager to install hexdump and/or hexedit to view and/or edit the file in hex.

A single newline at the end of any text file is standard for Linux.
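
To see the extra character without installing anything, here is a minimal sketch. It assumes od is available (it is specified by POSIX, but that is my substitution for hexdump, not something the commands above require) and uses printf so the newline is written explicitly. The variable names are only for illustration.

```shell
# Count characters with and without the trailing newline.
with_nl=$(printf '12345\n' | wc -m)
without_nl=$(printf '12345' | wc -m)
echo "with newline: $with_nl, without: $without_nl"

# od -c prints one column per byte; the newline shows up as \n.
dump=$(printf '12345\n' | od -c)
echo "$dump"
```

The first count should be 6 and the second 5; some wc implementations pad the number with leading spaces.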
 
OP
S

Sergii

Guest
Actually, we can use the echo command with the "-n" option to omit the end-of-line character from the count:

echo -n "12345" | wc -m
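
One caveat, as far as I know: POSIX leaves the handling of -n by echo implementation-defined, so some shells print the "-n" literally. A rough sketch of two alternatives that should behave the same everywhere is below; the variable names count and s are just placeholders.

```shell
# printf '%s' writes its argument exactly, with no newline added.
count=$(printf '%s' "12345" | wc -m)
echo "printf count: $count"

# Command substitution strips trailing newlines, so the stored
# string holds only the five digits.
s=$(echo "12345")
echo "string length: ${#s}"
```

Both values should come out as 5, since neither path leaves a trailing newline to count.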
 
OP
R

rstanley

Guest
Yes, that's fine also, but the commands DID act as expected as you originally wrote them.
 
OP
S

Sergii

Guest
I have some more questions about this problem.
Here are the steps I have done:
$ hexdump -C echo.txt
00000000  23 21 2f 62 69 6e 2f 73  68 0a 65 63 68 6f 20 22  |#!/bin/sh.echo "|
00000010  31 32 33 34 35 22 20 7c  20 77 63 20 2d 6d 0a     |12345" | wc -m.|

$ hexdump -c echo.txt
0000000 # ! / b i n / s h \n e c h o "
0000010 1 2 3 4 5 " | w c - m \n
000001f

As you can see, there is no end-of-line character in the middle of the command. It appears only at the end of each line in both dumps, so this looks very strange and confusing.
 
OP
J

JasKinasis

Guest
What you did there was put the command into a script and then hexdump the script itself. The newlines you are seeing there are the newlines you entered at the end of each line in the script, not the output of echo.

Try this instead:
Code:
echo "12345" > out1.txt
echo -n "12345" > out2.txt
This redirects the output of the two echo commands to files out1.txt and out2.txt.

Now use hexdump to view the two files as hex...
First file:
Code:
hexdump -C ./out1.txt
should yield the following:
Code:
00000000  31 32 33 34 35 0a                                 |12345.|
Note the 0a (newline/\n) at the end

Second file:
Code:
hexdump -C ./out2.txt
Which should yield this:
Code:
00000000  31 32 33 34 35                                    |12345|
Note there is no newline at the end of the file.

So using the -n switch as a parameter to the echo command prevents echo from adding a newline to the end of its output.

So now if you use the following commands:
Code:
wc -m ./out1.txt
wc -m ./out2.txt
You see the following results:
Code:
6 ./out1.txt
5 ./out2.txt

Understand now?
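
If you just want to check whether a text file ends with a newline, without reading a hex dump, something like the sketch below should work. The function name ends_with_newline is made up for this example, and it assumes mktemp -d is available so that it does not overwrite your out1.txt and out2.txt.

```shell
# ends_with_newline is a hypothetical helper, not a standard command.
ends_with_newline() {
    # tail -c 1 prints the last byte; $(...) strips a trailing newline,
    # so an empty result on a non-empty file means the last byte was \n.
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Work in a scratch directory (assumes mktemp -d exists on this system).
tmpdir=$(mktemp -d)
printf '12345\n' > "$tmpdir/out1.txt"
printf '12345' > "$tmpdir/out2.txt"

for f in "$tmpdir/out1.txt" "$tmpdir/out2.txt"; do
    if ends_with_newline "$f"; then
        echo "$(basename "$f"): ends with a newline"
    else
        echo "$(basename "$f"): no trailing newline"
    fi
done
```

Remove the scratch directory with rm -rf "$tmpdir" when you are done.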
 