How to deal with sudden intensive bot traffic from many ASNs/IPs to a vBulletin 4 forum + Cloudflare: ?s=*

amfzn

Hello, I wanted to share how I have rather successfully eliminated intensive bot traffic from many ASNs and subnets using a free Cloudflare account. I would also like your feedback on what that traffic is and whether I could handle it in a better way, maybe even without Cloudflare...

My vBulletin 4 forum website went down because the CPU limit was reached on a shared hosting account.
According to the logs, the reason was many visitors, sometimes several per second.
They did not come from the same IP or User-Agent, nor from similar /16 or /24 subnets, but from many ASNs. According to Webalizer stats (which I pulled into the Calc app and sorted), most visits came from:
HostPapa
RackNerd
Web2Objects

104.223.0.0/16 | HostPapa crawler, 20+ IPs from this range | 18 Dec, 2025 12:42:09
107.172.0.0/16 | RackNerd crawler, tens of IPs from this range | 18 Dec, 2025 12:43:31
107.173.0.0/16 | RackNerd crawler, tens of IPs from this range | 18 Dec, 2025 12:45:02
107.174.0.0/16 | RackNerd crawler, tens of IPs from this range | 18 Dec, 2025 12:45:02
107.175.0.0/16 | RackNerd crawler, tens of IPs from this range | 18 Dec, 2025 12:45:02
128.241.0.0/16 | Above average visit count, Tianfeng (Hong Kong) Communications Limited | 17 Dec, 2025 11:00:17
170.244.0.0/16 | some crawling, tmpdel | 19 Dec, 2025 21:17:29
172.245.0.0/16 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 21:17:29
192.3.0.0/16 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 21:07:09
104.168.0.0/17 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 21:17:29
142.147.128.0/17 | Web2Objects LLC crawling, challenge | 19 Dec, 2025 21:17:29
192.210.128.0/17 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 21:07:09
192.227.128.0/17 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 21:07:09
198.46.128.0/17 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 21:07:09
45.41.128.0/18 | Web2Objects - Tens of crawler visits | 19 Dec, 2025 20:59:36
84.37.192.0/18 | Crawler tens of IPs. This is huge block of IPs. Very big range, do not block, just challenge. | 19 Dec, 2025 20:30:22
64.188.0.0/19 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 20:51:47
167.160.32.0/19 | Web2Objects LLC crawling, challenge | 19 Dec, 2025 21:17:29
72.11.144.0/20 | HostPapa crawler, 20+ IPs from this range | 19 Dec, 2025 20:31:16
96.44.144.0/20 | HostPapa crawler, 20+ IPs from this range | 18 Dec, 2025 12:54:38
96.44.176.0/20 | HostPapa crawler, 20+ IPs from this range | ...
...
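
The per-subnet grouping above was done by hand in a spreadsheet. It could also be scripted. The following is only a minimal sketch, assuming an Apache-style access log where the client IP is the first space-separated field; `top_subnets` and its parameters are hypothetical names, not part of any tool mentioned here.

```python
import ipaddress
from collections import Counter


def top_subnets(log_lines, prefix=16, n=10):
    """Count requests per IPv4 subnet of the given prefix length.

    Assumes the client IP is the first space-separated field of each line.
    Lines that do not start with a valid IPv4 address are skipped.
    """
    counts = Counter()
    for line in log_lines:
        ip_text = line.split(" ", 1)[0]
        try:
            ip = ipaddress.ip_address(ip_text)
        except ValueError:
            continue  # not an IP address, e.g. a malformed line
        if ip.version != 4:
            continue  # this sketch only groups IPv4 addresses
        # strict=False lets us pass a host address and get its network back
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(net)] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up sample lines, only to show the call
    sample = [
        '107.172.1.1 - - [18/Dec/2025:12:00:00 +0100] "GET / HTTP/1.1" 200 1 "-" "x"',
        '107.172.2.2 - - [18/Dec/2025:12:00:01 +0100] "GET / HTTP/1.1" 200 1 "-" "x"',
        '192.3.4.5 - - [18/Dec/2025:12:00:02 +0100] "GET / HTTP/1.1" 200 1 "-" "x"',
    ]
    for subnet, hits in top_subnets(sample):
        print(subnet, hits)
```

Running it with `prefix=16` first and then with longer prefixes (17 to 20) on the busiest blocks would roughly reproduce the narrowing shown in the list above.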

I had no better budget-friendly idea than setting an Interactive challenge (captcha) for requests which seemed resource intensive and are rarely made by regular visitors. These rules were set at https://dash.cloudflare.com/idhere/mysite.com/security/security-rules
like this:
"When incoming requests match…"
Field = URI Path, Operator = wildcard, Value = /tags.php*
Field = AS Num, Operator = equals, Value = 36352

Full expression:
(http.request.uri.path wildcard r"/tags.php*")
or (http.request.uri.path wildcard r"/search.php*")
or (http.request.uri.path wildcard r"/sendmessage.php*")
or (http.request.uri.path wildcard r"/register.php*")
or (http.request.uri.path wildcard r"/activity.php*")
or (http.request.uri.path wildcard r"/calendar.php*")
or (http.request.uri.path wildcard r"/faq.php*")
or (http.request.uri.path wildcard r"/showgroups.php*")
or (http.request.uri.path wildcard r"/archive/index.php*")
or (http.request.uri.path wildcard r"/mobile.php*")
or (ip.src.asnum eq 36352)
or (http.request.uri wildcard r"/.php*s=")

On that page, /security/security-rules, I have also set another Interactive challenge rule for the IPs/subnets listed in https://dash.cloudflare.com/idhere/configurations/lists/idhere :
Field = IP Source Address, Operator = is in list, Value = blockedips
(blockedips is the name of my list)

What really made a difference in my logs (in terms of traffic reduction) was the last clause: http.request.uri wildcard r"/.php*s="

Sample traffic before applying the rule:
179.235.87.36 - - [29/Dec/2025:21:30:18 +0100] "GET /index.php?s=d9156d0e71431276fe9e3f375222e502 HTTP/1.1" 200 27959 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36"
179.0.162.226 - - [29/Dec/2025:21:30:21 +0100] "GET /index.php?s=9ba0a69b334cbdde9ef46b7cc959a4d2 HTTP/1.1" 200 27955 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36"
38.51.29.88 - - [29/Dec/2025:21:30:21 +0100] "GET /index.php?s=7bc3c05d42ca95e9c61b841bfc507787 HTTP/1.1" 200 27957 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36"
17.241.75.86 - - [29/Dec/2025:21:29:24 +0100] "GET /showthread.php?123-abc!&s=367ac76fa620368b749e5e8dbe120d94 HTTP/1.1" 403 1755 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)"
93.180.217.30 - - [29/Dec/2025:21:29:51 +0100] "GET /showthread.php?896-abcde=&p=134388&s=f9413ba7a4694200bb8aa1c9e96ad435 HTTP/1.1" 200 17776 "-" "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 6.2; Trident/4.0)"
Am I blocking legitimate traffic with that s= rule? I used it because when I browse the site as a human, I do not see s= in URLs, nor in the logs near my visits.
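
To see which requests such a rule would catch, the matching can be emulated offline. This is a minimal sketch, assuming Cloudflare's `wildcard` operator is case-insensitive, must match the entire field value, and treats only `*` specially. `cf_wildcard` is a hypothetical helper and not a Cloudflare API. The pattern `/*.php*s=*` and the last two URIs are made up for comparison.

```python
import re


def cf_wildcard(pattern: str, value: str) -> bool:
    """Emulate a whole-value, case-insensitive wildcard match.

    Assumption: only '*' is special and stands for zero or more characters;
    every other character in the pattern is matched literally.
    """
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value, flags=re.IGNORECASE) is not None


URIS = [
    # taken from the access log sample above
    "/index.php?s=d9156d0e71431276fe9e3f375222e502",
    "/showthread.php?123-abc!&s=367ac76fa620368b749e5e8dbe120d94",
    "/showthread.php?896-abcde=&p=134388&s=f9413ba7a4694200bb8aa1c9e96ad435",
    # hypothetical URIs without a session parameter
    "/showthread.php?123-abc",
    "/forumdisplay.php?tags=linux",
]

PATTERNS = ["/.php*s=", "/*.php*s=*"]

if __name__ == "__main__":
    for pattern in PATTERNS:
        print("pattern:", pattern)
        for uri in URIS:
            print("  ", cf_wildcard(pattern, uri), uri)
```

Under these assumptions, the pattern exactly as written above would not match the sampled URIs, while `/*.php*s=*` would. The second pattern would also catch any parameter whose name merely ends in s (such as the made-up `tags=`), so the pattern actually saved in the dashboard is worth checking against a sample of regular visitor URLs.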

Note that, as a further measure to reduce the bot flood, I am also using .htaccess firewalls:
and

and I have set crawl-delay in robots.txt (most bots do not respect it, but such bots can at least be reported)
 

Have you seen this:

 
Thanks, I actually had that already enabled. Today I saw another flood that brought the site down. Studying the latest visits in the web server access log, 95% or more of them came from a single User-Agent: https://user-agents.net/string/mozi...e-112-0-0-0-safari-537-36-agency-94-8-4097-98

So in Cloudflare, under the site's /security/security-rules, I have further updated the ruleset described above by adding:
or (http.user_agent contains "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36")
As described, the action for this ruleset is Interactive challenge (captcha).
After that, the site started loading again. I have also added other uncommon URI parameters seen in the access log.
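
The User-Agent share mentioned above can be measured directly from the access log instead of by eye. This is a minimal sketch, assuming the combined log format where the User-Agent is the last double-quoted field on each line; `user_agent_share` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: the User-Agent is the last double-quoted field on the line
UA_RE = re.compile(r'"([^"]*)"\s*$')


def user_agent_share(log_lines):
    """Return (user_agent, fraction_of_requests) pairs, most frequent first."""
    counts = Counter()
    for line in log_lines:
        match = UA_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    total = sum(counts.values())
    if total == 0:
        return []
    return [(ua, hits / total) for ua, hits in counts.most_common()]


if __name__ == "__main__":
    # Made-up sample lines, only to show the call
    sample = [
        '1.2.3.4 - - [x] "GET /a HTTP/1.1" 200 1 "-" "AgentA"',
        '1.2.3.5 - - [x] "GET /b HTTP/1.1" 200 1 "-" "AgentA"',
        '1.2.3.6 - - [x] "GET /c HTTP/1.1" 200 1 "-" "AgentB"',
    ]
    for ua, share in user_agent_share(sample):
        print(f"{share:.0%}  {ua}")
```

A single User-Agent with a very high share during a flood is a reasonable candidate for a challenge rule, though a common browser string may also be sent by real visitors.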

I am glad CF exists; otherwise I would have to look for a similar web firewall which serves a captcha based on URL or UA. The question is whether I would be able to do that reliably without blocking good search engine bots. If you know of an alternative to CF which does this, please mention it. I know there is https://anubis.techaro.lol/docs/ which apparently applies Proof of Work based on URL and User-Agent, though the configuration may not be easy.
 

