A cut above

dos2unix


Using the cut Command in Linux

The cut command in Linux extracts sections from each line of its input, which is usually a file or a pipe. It can select by byte position (-b), character position (-c), or delimited field (-f). This article compares the -c option with the -f option used together with -d (delimiter), and gives examples of each.

Character Position (-c)

The -c option selects characters by position. This is useful when every column has the same width on every line, as in a fixed-width format.

Example:

Suppose you have a file data.txt with the following content:

12345
67890
abcde
fghij

To extract the first three characters of each line, you can use:

Code:
 cut -c 1-3 data.txt

Output:

123
678
abc
fgh
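
The -c option also accepts a comma-separated list of positions and ranges. A minimal sketch, using printf to recreate two of the sample lines rather than reading data.txt (the variable name picked is only for illustration):

```shell
# Recreate two of the sample lines and keep characters 1, 3 and 4 of each.
picked=$(printf '12345\nabcde\n' | cut -c 1,3-4)
printf '%s\n' "$picked"
# prints:
# 134
# acd
```

cut always prints the selected characters in their original order, however the list is written.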

Field (-f) with Delimiter (-d)

The -f option selects fields (columns) that are separated by a delimiter. This is useful when columns vary in length but share a common separator. The -d option sets the delimiter, which must be a single character.

Example:

Suppose you have a file data.csv with the following content:

name,age,city
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago

To extract the second field (age) from each line, you can use:

Code:
 cut -d ',' -f 2 data.csv

Output:

age
30
25
35
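
Like -c, the -f option takes a list, so several columns can be pulled out in one call. A small sketch that feeds two of the sample lines through printf instead of reading data.csv (name_city is an illustrative variable name):

```shell
# Keep the name and city columns (fields 1 and 3) of comma-separated input.
name_city=$(printf 'name,age,city\nAlice,30,New York\n' | cut -d ',' -f 1,3)
printf '%s\n' "$name_city"
# prints:
# name,city
# Alice,New York
```

The selected fields are joined with the same delimiter that was used to split them.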

Whitespace as a Delimiter

Whitespace in a text file means spaces, tabs, and other non-printing characters. To split on spaces, pass a quoted space to -d. Tab is cut's default delimiter, so tab-separated input needs no -d at all.

Example:

Suppose you have a file data.txt with the following content:

name age city
Alice 30 New York
Bob 25 Los Angeles
Charlie 35 Chicago

To extract the second field (age) using space as a delimiter, you can use:

Code:
 cut -d " " -f 2 data.txt

Output:

age
30
25
35
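
cut counts every single space as a field boundary, so columns padded with runs of spaces produce empty fields. One common workaround, sketched here with made-up padded input, is to squeeze repeated spaces with tr -s before cutting:

```shell
# Hypothetical input aligned with several spaces between columns.
padded='Alice    30   Chicago'
# Without squeezing, field 2 is the empty string between the first two spaces.
raw=$(printf '%s\n' "$padded" | cut -d ' ' -f 2)
# tr -s ' ' collapses each run of spaces into one before cut splits the line.
squeezed=$(printf '%s\n' "$padded" | tr -s ' ' | cut -d ' ' -f 2)
printf 'raw=[%s] squeezed=[%s]\n' "$raw" "$squeezed"
# prints: raw=[] squeezed=[30]
```

This does not help when a value itself contains a space, as "New York" does in the sample file.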

Complex Example with Multiple Delimiters

Sometimes a file uses more than one delimiter. In such cases, you can pipe one cut command into another, with each stage splitting on a different character.

Example:

Suppose you have a file data.txt with the following content:

name:age|city
Alice:30|New York
Bob:25|Los Angeles
Charlie:35|Chicago

To extract the ages, you can first cut by the | delimiter to keep the name:age part, and then by the : delimiter:

Code:
 cut -d '|' -f 1 data.txt | cut -d ':' -f 2

Output:

age
30
25
35
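
As an alternative sketch, tr can first rewrite one delimiter into the other, after which a single cut can address every column. This assumes neither delimiter character appears inside the data itself; the variable names are illustrative:

```shell
# Sample line in the mixed name:age|city layout.
record='Alice:30|New York'
# Turn every | into : so one cut call can reach all three columns.
age=$(printf '%s\n' "$record" | tr '|' ':' | cut -d ':' -f 2)
city=$(printf '%s\n' "$record" | tr '|' ':' | cut -d ':' -f 3)
printf 'age=%s city=%s\n' "$age" "$city"
# prints: age=30 city=New York
```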

Conclusion

The cut command is a versatile tool for text processing in Linux. Use -c for fixed-width columns and -f with -d for delimited columns of varying width. A space works as a delimiter when it is quoted. For input with more than one delimiter, you can pipe several cut commands together.
 


When cutting, whether by character position or by delimited field, you can omit the beginning or the end of the range to get "all characters (however many there are) starting with the fifth" or "all fields (however many there are) after the third":

Code:
$ # first 3 chars
$ echo "loki:toki:smoki:juki:plop" |cut -c-3
lok

$ # 3rd character through end of input/line
$ echo "loki:toki:smoki:juki:plop" |cut -c3-
ki:toki:smoki:juki:plop

$ # first 2 fields
$ echo "loki:toki:smoki:juki:plop" |cut -d: -f-2
loki:toki
 
$ # 2nd field through end of input/line
$ echo "loki:toki:smoki:juki:plop" |cut -d: -f2-
toki:smoki:juki:plop

Before I figured this out, I used to use something like cut -c 10-1000 to make sure I would get everything through the end of the line... but, of course, with no guarantee the input wouldn't be longer than a thousand characters.
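
A quick way to see the difference, as a sketch with a generated 1500-character line (the length is arbitrary, chosen only to exceed the guessed bound):

```shell
# Build a 1500-character line of zeros, longer than the guessed bound of 1000.
long_line=$(printf '%01500d' 0)
# A guessed upper bound silently drops everything past character 1000.
guessed=$(printf '%s\n' "$long_line" | cut -c 10-1000)
# An open-ended range keeps everything from character 10 to the end of the line.
open_ended=$(printf '%s\n' "$long_line" | cut -c 10-)
printf 'guessed: %s chars, open-ended: %s chars\n' "${#guessed}" "${#open_ended}"
# prints: guessed: 991 chars, open-ended: 1491 chars
```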
 

