Grepping (or otherwise cleaning) a huge list of IPs in one run?

None-yet

I have grepable output from Nmap. Every record is on one line, but the IPs vary in length (11 or 12 characters in my data). Most IPs appear twice: once on the line where the port is listed and once on the line where the status is, as in the example. I need the fastest method of cleaning this up by removing everything but the IP and removing the duplicate listings. So basically I need only the IPs, with no dups.

Here is an example:
Host: xx.x.xx.xx () Status: Up
Host: xxx.x.xx.xx () Ports: xx/filtered/tcp//ftp///
Host: xxx.x.xx.xx () Status: Up
Host: xxx.x.xx.xx () Ports: xx/filtered/tcp//ftp///
Host: xxx.xxx.xxx.xx () Ports: xx/closed/tcp//ftp///
Host: xxx.xxx.xxx.xxx () Status: Up
Host: xxx.xxx.xxx.xx () Ports: xx/filtered/tcp//ftp///
Host: xxx.xxx.xxx.xxx () Status: Up
Host: xxx.xxx.xxx.xxx () Ports: xx/filtered/tcp//ftp///

My question is how to do this. Will it need to be done in several passes to account for IPs of varying lengths, or can it be done in one run, and if so, how?

Thanks and you (Americans) all have a great Thanksgiving!
 


Hmmm... not sure exactly what context you're trying to do this in. There may be a way to grep/sed/awk something here, I don't know. I'd write a Perl or Python script and use a regex to suck the IP out of each line and then write a new file containing the list of IPs. I used to know the regex to suck an IP address up, but it's been over ten years and I can't remember. Anybody?
Anyway, the script would look something like:
Code:
for each record in the file containing the nmap data:
    regex the IP address out into a var
    check your array of vars to see if the new IP exists
    if not, put the new IP on the array

then:
    traverse your array of IPs and write each one to its own line of an output file
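For anyone who would rather stay in the shell, the pseudocode above can be sketched with grep and awk. This is only a rough illustration: the function name unique_ips is made up, the regex is the loose "four groups of one to three digits" kind, and it assumes a grep that supports -o, -h and -E (GNU and BSD grep do). The awk part plays the role of the array: it remembers every IP it has seen and prints only the first occurrence, so the original order is kept.

```shell
# unique_ips [FILE...]: print every IPv4-looking string once, in first-seen order.
# Reads standard input when no files are given. (Hypothetical helper name.)
unique_ips() {
    # -o prints only the matched text, -h drops file-name prefixes,
    # -E enables extended regex syntax such as {1,3}.
    grep -ohE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "$@" |
        awk '!seen[$0]++'   # print a line only the first time it appears
}
```

It would be called as unique_ips /path/to/file > /path/to/listofIPs, with the paths being placeholders.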
 
Off the top of my head - you're looking for the digits 0-9, repeated one to three times, followed by a period, then another one to three digits, another period, another one to three digits, another period and finally another one to three digits.

Using a regex with grep that should look something like this:
Bash:
\egrep -o "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" /path/to/file | sort -u > /path/to/listofIPs
The initial backslash bypasses any alias that might be set for egrep - we don't want additional options being used with egrep that might pollute our output.
The -o option tells grep to print only the part of each line that matches the regex, rather than the whole line.
Then there is the regex and the path to the file to grep through.
Egrep's output is piped to sort; we've used the -u option to make it a unique sort, removing duplicates. And finally we redirect to an output file.
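One thing to be aware of: a plain sort -u compares the lines as text, so 10.0.0.10 lands before 10.0.0.9. If numeric order matters, sort can be told to treat each dot-separated field as a number. A small sketch, with sort_ips being a made-up name:

```shell
# sort_ips: read one IPv4 address per line on standard input and print them
# de-duplicated, ordered numerically octet by octet. (Hypothetical helper name.)
sort_ips() {
    # -t . splits fields on dots; each -kN,Nn compares field N as a number;
    # -u keeps one line per distinct set of key values.
    sort -t . -k1,1n -k2,2n -k3,3n -k4,4n -u
}
```

It would simply replace the sort -u stage at the end of the pipeline.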

And if you have multiple files in a directory containing IPs, you could add grep's -R option (recursive) and then specify a directory instead of a single file. Add -h as well: when grep searches more than one file it prefixes each match with the file name, so without -h the same IP found in two files would survive the unique sort.
Bash:
\egrep -Rho "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" /path/to/directory | sort -u > /path/to/listofIPs
That will search every file in the specified directory (and any sub-directories) for IP addresses and will output a sorted, unique list of IPs.

EDIT:
I’m on my phone and not anywhere near a PC atm, so I haven’t tested it. But I’m fairly confident it’s correct!

Also, the above is a quick and dirty regex: it will accept values over 255 for each of the four numbers.
I didn't have time to try to work out a more robust/correct one that will only pick out valid IP addresses. However, a quick web search should yield a more accurate regex to use.
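If a stricter regex feels like overkill, another option is to keep the loose regex and filter its output afterwards. The sketch below is not a regex at all, just a range check in awk on each dot-separated number; valid_ipv4_only is a made-up name, and it assumes its input is one candidate address per line, as produced by the grep -o stage.

```shell
# valid_ipv4_only: read one candidate address per line and print only those
# with exactly four dot-separated numbers, each no larger than 255.
# (Hypothetical helper name.)
valid_ipv4_only() {
    awk -F. 'NF == 4 && $1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255'
}

# Possible use, with placeholder paths:
#   grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /path/to/file |
#       valid_ipv4_only | sort -u > /path/to/listofIPs
```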
 
For multiple files in the same dir, is it possible to grep them all with a wildcard somehow?
 
Yes, you can do that too!
The exact pattern to use will depend on how the files you’re interested in are named.
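As a rough example, assuming the files all end in .txt and sit in one directory, something like this should do it. The function name ips_from_txt is made up for illustration. Note the -h: with more than one file, grep normally prefixes every match with the file name, which would defeat the unique sort.

```shell
# ips_from_txt [DIR]: collect IPv4-looking strings from every .txt file in DIR
# (default: the current directory) and print a sorted, unique list.
# (Hypothetical helper name.)
ips_from_txt() {
    dir=${1:-.}
    # "$dir"/*.txt is expanded by the shell into the matching file names.
    grep -ohE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "$dir"/*.txt |
        sort -u
}
```

Run from inside the directory holding the files, ips_from_txt > list-of-ips.txt would be enough; the output name is only a placeholder.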
 
Thanks
 
I have been unable to get this to work thus far. Here is what I did with a wildcard on 8 files. It never went through, though.
Code:
root@kali:/media/sf_Storage_1/Master# \egrep -Ro "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" /*.txt | sort -u > /list-of-Ips.txt
root@kali:/media/sf_Storage_1/Master#
 
I also tried this, from inside the folder with the 8 files I was hoping to pull from.

Code:
root@kali:/media/sf_Storage_1/Master# \egrep -o "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" /*txt | sort -u > IP's.txt
>
 
Hmmmm, strange..... I was pretty sure that regex should work.
I'll fire up my laptop later and will try that regex myself to see if it works.

In the meantime - can you confirm your file paths are correct?

According to your snippets, the text files are in the root of the file-system. i.e. /
So the files you're searching are at /*.txt?
Is that correct?
Or did you mean to put ./*.txt - as in "all .txt files in the current working directory"?

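The text files being in root looks suspect to me. Also, in your second attempt the apostrophe in IP's.txt opens a single-quoted string that never gets closed, which is why the shell dropped to the > continuation prompt instead of running the command. Use a name without an apostrophe, or quote the whole file name.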


Otherwise, perhaps it's where I'm escaping the periods with backslashes in the regex?
Normally a period is a special character in a regex, meaning "match any character". Escaping it with a backslash should make it be interpreted as a literal period character instead, which is how we want the periods to be interpreted.
So maybe we don't need to escape the periods?
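A quick way to settle the escaping question is to feed both versions of the regex a line that has other characters where the periods should be. This is only a sketch, and the two function names are made up for illustration; the ^ and $ anchors are added here so that whole lines are compared.

```shell
# Both helpers print the input lines that match from start to end.
# (Hypothetical helper names.)
match_escaped() {
    # \. matches a literal period only
    grep -E '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$'
}
match_unescaped() {
    # . matches any single character
    grep -E '^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$'
}
```

Piping the two lines 10.0.0.1 and 10x0y0z1 through each, the escaped version keeps only the real address while the unescaped one lets both through, so the backslashes do belong there.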

Either way - I'll fire up my laptop later and will give it a try. But those are just a few thoughts off the top of my head!
 
OK, I've tried my original regex on my laptop and it works for me.... So perhaps your file-paths were incorrect or something?! IDK.

This regex is a little better: each of the four numbers has to be in the range 0-255. Note that with -o it can still pick a valid-looking piece out of a longer run of digits.
Bash:
\egrep -o "((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)" /path/to/file | sort -u > ~/uniqueIPV4List.txt

And if you need to extract any IPv6 addresses - it's a lot more complex:
Bash:
\egrep -o "(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))" /path/to/file | sort -u > ~/uniqueIPV6List.txt
I found the above regex in a web-search and plugged it into the \egrep command. It's horrible to read, but it works nicely!
 
