grepping files

Diputs · May 23, 2024

I was wondering how to perform the query below in Bash.
I don't need the code written out in full; I just want to get the idea of how to do it:

The question is: I'm looking for any file, or files, that contain several given words.
I don't mean the names of files that have a single line containing all of these words, like this:

grep word1 file.txt | grep word2 | grep word3

I need the name of the file or files that have ALL of the words, but ANYWHERE INSIDE the file.
(So, not necessarily on the same line.)

I would think grep is needed, but I'm not sure how to use it for this.

dos2unix · May 23, 2024

You could try...

grep -Rnw '/path/to/somewhere/' -e 'word1' -e 'word2' -e 'word3'

You don't need the filename if you're looking at all the files.

dcbrown73 · May 24, 2024

You could probably write that in awk.
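A minimal sketch of how that might look, not something anyone in the thread posted. The function name `awk_all_words` is made up for illustration, and it assumes the words are plain ASCII words given as one space-separated string. It uses `FNR == 1` to notice when awk moves on to the next file, so it does not depend on gawk-only features.

```shell
# Hypothetical helper: print each file that contains every given word
# somewhere in it (whole words, not necessarily on the same line).
# Usage: awk_all_words "word1 word2 word3" file...
awk_all_words() {
    words=$1
    shift
    awk -v words="$words" '
        # Print the file just finished if every word was seen in it.
        function report() {
            if (prev != "" && found == n) print prev
        }
        BEGIN { n = split(words, w, " ") }
        FNR == 1 {                      # first line of a new file
            report()
            for (i = 1; i <= n; i++) seen[i] = 0
            found = 0
            prev = FILENAME
        }
        {
            # Pad the line and turn every non-word character into a
            # space, so index() can only match whole words.
            line = " " $0 " "
            gsub(/[^A-Za-z0-9_]+/, " ", line)
            for (i = 1; i <= n; i++)
                if (!seen[i] && index(line, " " w[i] " ")) {
                    seen[i] = 1
                    found++
                }
        }
        END { report() }                # do not forget the last file
    ' "$@"
}
```

Called as `awk_all_words "word1 word2 word3" *`, this reads every file once and prints only the names of files in which all of the words turned up.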

osprey · May 24, 2024

Here's a preliminary attempt to output the names of the files that have all the searched words ... @Diputs objective AIUI:

Here are some files to work on and their contents. The number in the filename corresponds to the number of words in the file:

Code:

[tom@min ~]$ ls
file3  file4  file6  file7

[tom@min ~]$ cat file3
red
green
blue

[tom@min ~]$ cat file4
red
green
blue
yellow

[tom@min ~]$ cat file6
red
green
blue
yellow
black
white

[tom@min ~]$ cat file7
red
green
blue
yellow
black
white
brown

Running the @dos2unix command:

Code:

[tom@min ~]$ grep -Rnw * -e 'red' -e 'green' -e 'black'
file3:1:red
file3:2:green
file4:1:red
file4:2:green
file6:1:red
file6:2:green
file6:5:black
file7:1:red
file7:2:green
file7:5:black

Extracting the filenames that contain the search items:

Code:

[tom@min ~]$ grep -Rnw * -e 'red' -e 'green' -e 'black' | awk -F: '{print $1}'
file3
file3
file4
file4
file6
file6
file6
file7
file7
file7

Collapsing repeated filenames, with each name prefixed by the number of times it appeared:

Code:

[tom@min ~]$ grep -Rnw * -e 'red' -e 'green' -e 'black' | awk -F: '{print $1}' | uniq -c
      2 file3
      2 file4
      3 file6
      3 file7

The output shows that 2 files (file6 and file7) had all 3 searched terms. There's more that could be done. How would this scale to a large number of files?
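One thing to watch with `uniq -c` here: it counts matching lines, so a file with the same word on three lines would also show a count of 3. A possible refinement, sketched below under some assumptions, is to have grep print only the matched word (`-o`), drop duplicate file/word pairs, and then keep the files whose count equals the number of words. The function name `count_all_words` is hypothetical, and the sketch assumes GNU grep, literal words with no duplicates in the list, and filenames without colons, spaces or newlines.

```shell
# Hypothetical helper: list files under DIR that contain every word.
# Usage: count_all_words DIR word1 word2 ...
count_all_words() {
    dir=$1
    shift
    n=$#                            # how many distinct words we expect
    for w in "$@"; do               # append "-e word" for each word...
        set -- "$@" -e "$w"
    done
    shift "$n"                      # ...then drop the bare words
    # -o prints only the matched word, so each line is FILE:WORD.
    # sort -u keeps one line per (file, word) pair; LC_ALL=C keeps
    # all lines of one file next to each other for uniq.
    grep -RFowH "$@" "$dir" |
        LC_ALL=C sort -u |
        cut -d: -f1 |
        uniq -c |
        awk -v n="$n" '$1 == n { print $2 }'
}
```

With the four sample files in the current directory, `count_all_words . red green black` should list only the files where all three words occur, regardless of how many lines each word appears on.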

Diputs · May 24, 2024

I need to give this a try.

Diputs · May 31, 2024

dos2unix said:
You could try...

grep -Rnw '/path/to/somewhere/' -e 'word1' -e 'word2' -e 'word3'

You don't need the filename if you're looking at all the files.

That's OK if you want to count matches, but I'm only interested in whether all of the words appear in a file or not. If not all of the words are present in a file, it's no good to me.
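Several `-e` patterns in one grep call are alternatives (any one of them matches), so a yes/no answer per file needs one test per word. A rough sketch of that idea follows; the function name `has_all_words` is made up for illustration, and the words are treated as literal text (`-F`) matched as whole words (`-w`).

```shell
# Hypothetical helper: succeed only if FILE contains every word,
# anywhere in the file.
# Usage: has_all_words FILE word1 word2 ...
has_all_words() {
    file=$1
    shift
    for word in "$@"; do
        # -q: print nothing, just set the exit status.
        # Give up on this file as soon as one word is missing.
        grep -qwF -e "$word" -- "$file" || return 1
    done
    return 0
}
```

It could then be used in a loop such as `for f in *; do has_all_words "$f" word1 word2 word3 && echo "$f"; done`, which prints only the names of files that contain all of the words.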

Diputs · May 31, 2024

dcbrown73 said:
You could probably write that in awk.

Maybe ... but I'd need to look at the functions. I'm aware of awk, and I actually use it a lot, but mostly with the default actions.
I need to have a look.

Diputs · May 31, 2024

osprey said:
Here's a preliminary attempt to output the names of the files that have all the searched words ... @Diputs objective AIUI:

Here are some files to work on and their contents. The number in the filename corresponds to the number of words in the file:

Code:

[tom@min ~]$ ls
file3  file4  file6  file7

[tom@min ~]$ cat file3
red
green
blue

[tom@min ~]$ cat file4
red
green
blue
yellow

[tom@min ~]$ cat file6
red
green
blue
yellow
black
white

[tom@min ~]$ cat file7
red
green
blue
yellow
black
white
brown

Running the @dos2unix command:

Code:

[tom@min ~]$ grep -Rnw * -e 'red' -e 'green' -e 'black'
file3:1:red
file3:2:green
file4:1:red
file4:2:green
file6:1:red
file6:2:green
file6:5:black
file7:1:red
file7:2:green
file7:5:black

Extracting the filenames that contain the search items:

Code:

[tom@min ~]$ grep -Rnw * -e 'red' -e 'green' -e 'black' | awk -F: '{print $1}'
file3
file3
file4
file4
file6
file6
file6
file7
file7
file7

Collapsing repeated filenames, with each name prefixed by the number of times it appeared:

Code:

[tom@min ~]$ grep -Rnw * -e 'red' -e 'green' -e 'black' | awk -F: '{print $1}' | uniq -c
      2 file3
      2 file4
      3 file6
      3 file7

The output shows that 2 files (file6 and file7) had all 3 searched terms. There's more that could be done. How would this scale to a large number of files?

I can see this working, but the scalability is not the best. Still, I like this solution.
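On the scalability point, one approach that may help with many files is to let `grep -l` print only the names of matching files and feed that list into the next grep, so each further word is searched only in files that already contain the earlier ones. For three fixed words that is roughly `grep -RlZw word1 dir | xargs -0 grep -lZw word2 | xargs -0 grep -lw word3`. The sketch below generalises it to any number of words; the function name `narrow_all_words` is hypothetical, and it assumes GNU grep and xargs (`-Z`, `-0` and `-r` are extensions), with the words taken as literal text.

```shell
# Hypothetical helper: list files under DIR that contain every word,
# re-reading only the files that survived the previous word.
# Usage: narrow_all_words DIR word1 [word2 ...]
narrow_all_words() {
    dir=$1
    first=$2
    shift 2
    list=$(mktemp) || return 1
    next=$(mktemp) || return 1
    # Files containing the first word, as a NUL-separated list
    # (safe for filenames with spaces or newlines).
    grep -RlZwF -e "$first" -- "$dir" > "$list"
    for word in "$@"; do
        # Keep only the files that also contain this word.
        # -r: do not run grep at all if the list is already empty.
        xargs -0 -r grep -lZwF -e "$word" -- < "$list" > "$next"
        mv "$next" "$list"
    done
    tr '\0' '\n' < "$list"          # print one name per line
    rm -f "$list" "$next"
}
```

Putting the rarest word first should shrink the candidate list fastest, since every later pass then has fewer files to read.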
