Could someone help me understand this command behavior?


Nov 2, 2022
To explain the problem: I have about a hundred .tsv (tab-separated values) files that store vendor information for certain client accounts. I am trying to parse these files to generate a pipe-delimited string from each individual column, for import into a different platform. Several of the files show a strange behavior that I can't make sense of.

One such .tsv file looks roughly like this...

vendor                 purpose       website     username    password    additional_info
Constant Contact       Email/SMTP                [redacted]  [redacted]  empty
Digital Ocean Hosting  Web Hosting               [redacted]  [redacted]  empty
Google Analytics       Misc.                     empty       empty       [redacted]
Nexcess Portal         Web Hosting               empty       empty       request invitation
Nexcess Siteworx       Web Hosting   [redacted]  [redacted]  [redacted]  empty
Twitter                Social Media              [redacted]  [redacted]  Twitter Profile:[redacted]

Say I want to focus on the sixth column, "additional_info", while dropping the header line. I can run the following...
> awk -F "\t" '{print $6}' "./Client1234.tsv" | tail -n+2

which results in...
empty
empty
[redacted]
request invitation
empty
Twitter Profile:[redacted]

which is expected.
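
For what it's worth, the header can also be dropped inside awk itself with an NR > 1 condition, which saves the extra tail process. This is only a minimal sketch: the three-column sample.tsv it builds in a temp directory is hypothetical and stands in for the real client file.

```shell
# Build a small, hypothetical tab-separated sample (not the real client data).
tmpdir=$(mktemp -d)
printf 'vendor\tpurpose\tadditional_info\n'                >  "$tmpdir/sample.tsv"
printf 'Nexcess Portal\tWeb Hosting\trequest invitation\n' >> "$tmpdir/sample.tsv"
printf 'Twitter\tSocial Media\tempty\n'                    >> "$tmpdir/sample.tsv"

# Original approach: print the column for every line, then cut the header with tail.
with_tail=$(awk -F '\t' '{print $3}' "$tmpdir/sample.tsv" | tail -n +2)

# Alternative: NR > 1 makes awk skip the first (header) line on its own.
awk_only=$(awk -F '\t' 'NR > 1 {print $3}' "$tmpdir/sample.tsv")

printf '%s\n' "$awk_only"
```

Both variables should hold the same two values, so either form can feed the next step.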

Now, I want to use that and turn it into a pipe delimited string by using the following...
> awk -F "\t" '{print $6}' "./Client1234.tsv" | tail -n+2 | paste -sd '|'

which results in...
whereas what I would expect to see is this...
empty|empty|[redacted]|request invitation|empty|Twitter Profile:[redacted]

What's strange is that if I perform the same task on any of the other columns in the file, it results in the expected behavior...

> awk -F "\t" '{print $2}' "./Client1234.tsv" | tail -n+2 | paste -sd '|'
Constant Contact|Digital Ocean Hosting|Google Analytics|Nexcess Portal|Nexcess Siteworx|Twitter
> awk -F "\t" '{print $3}' "./Client1234.tsv" | tail -n+2 | paste -sd '|'

I decided to use the 'cat' command to display all of the hidden characters to make sure that I am working with "tab" separated values and not just a bunch of space characters in a file, and sure enough the format is correct, as ^I represents the tab character in the cat command output.
> cat -A Client1234.tsv
Constant Contact^IEmail/SMTP^I^I[redacted]^I[redacted]^Iempty^M$
Digital Ocean Hosting^IWeb Hosting^I^I[redacted]^I[redacted]^Iempty^M$
Google Analytics^IMisc.^I^Iempty^Iempty^I[redacted]^M$
Nexcess Portal^IWeb Hosting^I^Iempty^Iempty^Irequest invitation^M$
Nexcess Siteworx^IWeb Hosting^I[redacted]^I[redacted]^I[redacted]^Iempty^M$
Twitter^ISocial Media^I^I[redacted]^I[redacted]^ITwitter Profile:[redacted]^M$
**Where it says "empty" in the file, that is the literal string "empty", used as a placeholder for an empty cell, because I thought the string "null" might have been causing this problem in earlier versions. Also, I don't think it matters, but I'm using zsh to run the commands rather than a traditional bash shell.
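
One detail in the cat -A output is worth a closer look: the ^M before each $ is how cat -A displays a carriage return, so these lines end in CRLF rather than a bare LF. With tab as the separator, that carriage return ends up inside the last field only, and a terminal that honors it moves the cursor back to the start of the line, which may be why only the sixth column misbehaves once the values are joined onto one line. The sketch below is one possible way to test that idea, not a confirmed diagnosis of the real files. It builds a hypothetical two-column sample with CRLF endings (crlf.tsv and lf.tsv are made-up names), strips a trailing carriage return in awk before printing, and lists which files contain a carriage return at all.

```shell
# Hypothetical sample with CRLF line endings (not the real client data).
tmpdir=$(mktemp -d)
printf 'vendor\tadditional_info\r\n'            >  "$tmpdir/crlf.tsv"
printf 'Nexcess Portal\trequest invitation\r\n' >> "$tmpdir/crlf.tsv"
printf 'Twitter\tempty\r\n'                     >> "$tmpdir/crlf.tsv"
# The same data with plain LF endings, for comparison.
tr -d '\r' < "$tmpdir/crlf.tsv" > "$tmpdir/lf.tsv"

# Without cleanup, every value from the last column still ends in a carriage return.
raw=$(awk -F '\t' 'NR > 1 {print $2}' "$tmpdir/crlf.tsv" | paste -sd '|' -)

# sub() with no target edits the whole record, removing one trailing carriage
# return; awk then re-splits the fields before $2 is printed.
clean=$(awk -F '\t' 'NR > 1 {sub(/\r$/, ""); print $2}' "$tmpdir/crlf.tsv" | paste -sd '|' -)
printf '%s\n' "$clean"

# List the files that contain a carriage return anywhere.
cr=$(printf '\r')
with_cr=$(grep -l "$cr" "$tmpdir"/*.tsv)
printf '%s\n' "$with_cr"
```

Running the grep -l line against the real directory would show whether the misbehaving files are exactly the ones with carriage returns.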

There are several other .tsv files in the list that seem to display the same behavior, but not all of them. I can't figure out why it is or what the difference is, but if anyone could help me understand what's going on here it would be most appreciated. Thanks!