postcd
Guest
Hello,
Is there a command that can output a Google SERP (search engine results page) to the screen or to a file?
For example, this is a Google search query page:
https://www.google.cz/#q=+intext:"my phrasse" other phrasse
The search phrase is: +intext:"my phrasse" other phrasse
My aim is to get the 10 URLs (the results Google found) so I can work with them further.
I tried to curl the search page URL above, but got a 403 error from Google.
Thank you
---
Update: I think a user agent needed to be defined. The command below outputs URLs from the Google SERP, but it mangles more complicated URLs that contain +, & or spaces.
curl -sLA "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" --connect-timeout 5 --max-time 10 "http://www.google.com/search?q=search+phrasse+here" | grep -ahoP 'http[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?' | grep -v "google"
Note that -A takes the user agent string itself, without a "User-Agent: " prefix, and the URL has to be quoted. Spaces in the search phrase have to be replaced with plus signs, like search+phrasse+here
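Replacing spaces with plus signs covers simple phrases, but a phrase like +intext:"my phrasse" also contains +, : and quotes that need percent-encoding. A minimal sketch of one way to do that in plain POSIX shell follows; the function name urlencode_query is made up for illustration, and it assumes the phrase is ASCII only.

```shell
# urlencode_query (hypothetical helper): encode a search phrase for use after "q=".
# Assumption: the phrase contains only ASCII characters.
# The body runs in a subshell ( ... ) so its variables do not leak to the caller.
urlencode_query() (
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [a-zA-Z0-9.~_-]) out=$out$c ;;                    # safe characters pass through
            ' ')             out=$out+ ;;                     # space becomes a plus sign
            *)               out=$out$(printf '%%%02X' "'$c") ;;  # everything else becomes %XX
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# Possible usage (not run here, needs network access):
#   lynx --dump "http://www.google.com/search?q=$(urlencode_query '+intext:"my phrasse" other phrasse')"
```

With curl, the -G and --data-urlencode options can do the same encoding without a helper function.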
Does anyone have an idea for a proper regex to extract the URLs from the SERP?
Update: this one works:
lynx --dump "http://www.google.com/search?q=search+phrasse+here" | grep -o '?q=http.*&sa' | awk -F'?q=|&sa' '{print $2}' | grep -v "google"
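One possible weakness of the pattern above is that '.*&sa' is greedy, so two result links on the same line would be merged into one match. A hedged variant that stops at the first & instead is sketched below. The function name extract_result_urls is made up, and it assumes the result links still appear as ...?q=<target>&sa=... in the page text, as they do in the command above.

```shell
# extract_result_urls (hypothetical helper): read page text on stdin, for example
# the output of `lynx --dump`, and print one result URL per line.
# Assumption: result links have the form ...?q=<target>&sa=...
extract_result_urls() {
    # [^&]* stops at the first "&", so each link is matched separately.
    # sed strips the leading "?q=", and the last grep drops Google's own links.
    grep -o '?q=http[^&]*' | sed 's/^?q=//' | grep -v 'google'
}

# Possible usage (not run here, needs network access):
#   lynx --dump "http://www.google.com/search?q=search+phrasse+here" | extract_result_urls > urls.txt
```

The extracted URLs are printed exactly as they appear in the link, so any percent-encoded characters in them are left encoded.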