OK! So we can make a few guesses:
- The bot name isn't "Unknown robot identified by bot\*"; the bot name is just "bot\*". (Actually, even this is highly suspect.)
- AWStats is telling you it doesn't recognize the bot ("Unknown robot identified by...").
- The bot name is likely bot<something>, not a literal asterisk. I think this is AWStats telling you it matched a bot by identifying the prefix "bot", i.e. AWStats did a substring match on 'bot\*'.
- You'll have to go awk'ing and grep'ing through your access_log files (or maybe tweaking AWStats?) to get the actual bot name.
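Here is a rough sketch of that log digging. It assumes your logs use Apache's default "combined" format, where the User-Agent is the sixth double-quote-delimited field; the function name and the log path are made up for illustration, so adjust both to your setup.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct User-Agent strings that contain
# "bot" (case-insensitive), most frequent first.
# Assumption: "combined" LogFormat, so splitting each line on double quotes
# puts the User-Agent in field 6. A request or referer that itself contains
# an escaped quote would shift the fields and throw this off.
list_bot_agents() {
    awk -F'"' 'tolower($6) ~ /bot/ { print $6 }' "$@" | sort | uniq -c | sort -rn
}

# Example call (the path is an assumption; use wherever your host keeps logs):
# list_bot_agents /var/log/apache2/access_log
```

The real name of the crawler should show up near the top of that list if it is hitting the site heavily.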
If the bot name were truly "Unknown robot identified by bot\*", then:
- You don't need the parentheses. RewriteCond expects a PCRE, so ( ) are only needed for grouping.
- The backslash+asterisk combination is pretty much a worst-case scenario for correct escaping. I would sidestep the issue by matching "Unknown robot identified by bot.." instead of "Unknown robot identified by bot\*". A single period "." in a regex is like a "?" in filename globbing: it matches any single character.
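To convince yourself the two-dot trick works without touching the server, you can try the pattern with grep first. This is only a sketch: grep -E is not PCRE, but "." means the same thing in both, and -i stands in for [NC]. The helper name ua_matches is made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the user-agent string ($2) matches the
# extended regex ($1), ignoring case (roughly what [NC] does in RewriteCond).
ua_matches() {
    printf '%s\n' "$2" | grep -Eiq -- "$1"
}

# Single quotes keep the backslash and asterisk literal in the test string.
ua='Unknown robot identified by bot\*'

# Two dots stand in for the backslash and the asterisk, so nothing is escaped.
pattern='Unknown robot identified by bot..'

if ua_matches "$pattern" "$ua"; then
    echo "match"
else
    echo "no match"
fi

# The .htaccess form of the same idea would be something like (untested here):
#   RewriteCond %{HTTP_USER_AGENT} "Unknown robot identified by bot.." [NC]
```

As far as I know, a pattern containing spaces has to be wrapped in double quotes as a whole, so that Apache reads it as a single argument.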
From that page, however, we can guess that you might be able to just write:
RewriteCond %{HTTP_USER_AGENT} bot[\s_+:,\.\;\/\\\-] [NC]
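Before deploying that, it may be worth checking which user agents the pattern would actually catch. The sketch below uses my own POSIX translation of the character class for grep -E (an assumption, not something taken from AWStats), and the "SomeBot" agent and the helper name are invented for the test.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the user agent contains "bot" followed by
# one of the separator characters, ignoring case.
# [[:space:]] replaces \s, and the hyphen sits last so it is taken literally.
ua_is_flagged() {
    printf '%s\n' "$1" | grep -Eiq 'bot[[:space:]_+:,.;/\\-]'
}

for ua in \
    'SomeBot/1.0 (made-up crawler)' \
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' \
    'Mozilla/5.0 (X11; Linux x86_64; rv:137.0) Gecko/20100101 Firefox/137.0'
do
    if ua_is_flagged "$ua"; then
        echo "blocked: $ua"
    else
        echo "allowed: $ua"
    fi
done
```

Note that the Googlebot string matches too ("bot/" and "bot." are both covered), so a rule this broad would likely shut out the crawlers you want along with the one you don't.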
-Adam
From: Montana Quiring <montanaq@gmail.com>
Sent: April 22, 2025 14:05
To: Continuation of Round Table discussion <roundtable@muug.ca>
Subject: [RndTbl] Re: .htaccess file: stopping robot with escape character in name
Sorry man, excuse my ignorance, but not sure what you are asking.
I got the bot name from AWStats, which I assume is just ASCII.
Regards,
-Montana
Urlencode or octal? Or if it's a regex just use ".".
-Adam
From: Montana Quiring <montanaq@gmail.com>
Sent: Tuesday, April 22, 2025 1:47:31 PM
To: Continuation of Round Table discussion <roundtable@muug.ca>
Subject: [RndTbl] .htaccess file: stopping robot with escape character in name
Hello Folks,
I'm trying to stop a bot from crawling a site using the .htaccess file. The problem is that it's using the backslash character as its name. Grrr...
It's called: Unknown robot identified by bot\*
This generates an internal server error:
RewriteCond %{HTTP_USER_AGENT} ("Unknown robot identified by bot\*") [NC]
I tried this, but it didn't help:
RewriteCond %{HTTP_USER_AGENT} ("Unknown robot identified by bot\\*") [NC]
Any thoughts?
Regards,
-Montana