One way to do it without having to install specialist tools is to unzip the file to stdout in the terminal, pipe it to grep, and then count the number of occurrences with wc:
Bash:
unzip -ca /path/to/file.odt 2> /dev/null | \grep --binary-files=text -oi "searchTerm" | wc -l
Here /path/to/file.odt is the path to the .odt file you want to search inside, and "searchTerm" is the text to search for. The search term can be a single word, a literal string, or a regex (regular expression).
Breaking the command down:
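In the unzip command, we use the -c and -a options.
-c extracts the files in the archive to stdout instead of writing them to disk.
-a converts any files that the archive flags as text to the local text format (line endings and so on), so what arrives on stdout is mostly XML text, possibly mixed with some binary data.
/path/to/file.odt is the path to the .odt file.
And 2> /dev/null hides any error messages by redirecting stderr to /dev/null, AKA the bit-bucket.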
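To see what the 2> /dev/null redirection does on its own, here is a minimal sketch that does not involve unzip at all. The noisy function is a made-up stand-in for any command that writes to both stdout and stderr:

```shell
# Hypothetical helper: writes one line to stdout and one line to stderr.
noisy() {
  echo "useful output"
  echo "warning: something went wrong" >&2
}

# With stderr redirected to /dev/null, only the stdout line is left.
kept=$(noisy 2> /dev/null)
echo "$kept"
```

Running noisy without the redirection would also show the warning line in the terminal.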
The unzipped and converted output is then piped to \grep. NOTE: The backslash \ in the invocation of grep is deliberate. It bypasses any alias that might be set up for grep, ensuring that the only options grep receives are the ones we explicitly specify.
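The full name of this GNU grep option is --binary-files=text. It tells grep to treat any binary data as text, so matches are printed instead of just a notice that a binary file matches.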
The -o option tells grep to display only the matching part of each line instead of the entire line. This matters because the text in the file is typically stored as a single, long, continuous line, so without -o grep would print that whole line once, however many matches it contains. With -o, each match is printed on its own line, allowing us to see each individual result.
The -i option makes the search case-insensitive. It is optional, and you can add any other grep options that you might need.
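The difference -o makes is easy to check on a single line of ordinary text. This is only an illustrative sketch; the sample sentence and the variable names are made up:

```shell
line='the cat sat on the mat with the hat'

# grep -c counts matching LINES, so a single long line counts once.
lines=$(printf '%s\n' "$line" | grep -c 'the')

# grep -o prints every match on its own line, so wc -l counts MATCHES.
matches=$(printf '%s\n' "$line" | grep -o 'the' | wc -l)

echo "matching lines: $lines"
echo "matches: $matches"
```

Here the first count is 1 and the second is 3, which is why the command uses -o together with wc -l when everything sits on one line.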
Then we have "searchTerm", which is the text to search for. Replace it with the actual word, text, or regular expression you want to search for in the file.
At this point the command lists all occurrences of the word or pattern in the unzipped file.
Finally, we pipe the output from grep to wc -l, which counts the lines of output grep returned. Because -o prints one match per line, that count is the number of occurrences of your word/text/pattern in the file.
But that will only work on .odt documents and other .od? files produced by LibreOffice. It won't work on .docx.