The script below logs every bot visit to a file, sends me an email, and includes an ip2location link for verifying the IP. It worked fine on PHP 5.2 with the eregi function. I changed the eregi line to preg_match and added forward slashes around each bot variable, because I was getting a "preg_match(): Delimiter must not be alphanumeric or backslash" warning. After that it worked for a few minutes on my WAMP testing server, but now it no longer works and doesn't log any bots in the visits.log file.
The script still gives me the three warnings below, but since they were only warnings and it had begun working, I didn't pay much attention to them:
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
$to = "email@here.com";
$log = "./visits.log";
$dateTime = date("r");
$agents[] = "/googlebot/";
$spiders[] = "/Google/";
$spiders[] = "/Googlebot/";
$agents[] = "/slurp/";
$spiders[] = "/Slurp (Inktomi's robot, HotBot)/";
$agents[] = "/msnbot/";
$spiders[] = "/MSN Robot (MSN Search, search\.msn\.com)/";
$agents[] = "/yahoo\! slurp/";
$spiders[] = "/Yahoo! Slurp/";
$agents[] = "/bingbot/";
$spiders[] = "/Bing\.com/";
$ip= $_SERVER['REMOTE_ADDR'];
$found = false;
for ($spi = 0; $spi < count($spiders); $spi++)
if ($found = preg_match($agents[$spi], $_SERVER['HTTP_USER_AGENT']))
break;
if ($found) {
$url = "http://" . $_SERVER['SERVER_NAME']. $_SERVER['PHP_SELF'];
if ($_SERVER['QUERY_STRING'] != "") {
$url .= '?' . $_SERVER['QUERY_STRING'];
}
$line = $dateTime . " " . $spiders[$spi] . " " . $ip." @ " . $url;
$ip2location = "https://www.ip2location.com/".$_SERVER['REMOTE_ADDR'];
if ($log != "") {
if (@file_exists($log)) {
$mode = "a";
} else {
$mode = "w";
}
if ($f = @fopen($log, $mode)) {
@fwrite($f, $line . "\n");
@fclose($f);
}
}
if ($to != "") {
$to = "email@here.com";
$subject = $spiders[$spi]. " crawled your site";
$body = "$line". "\xA\xA" ."Whois verification available at: $ip2location";
mail($to, $subject, $body);
}
}
if ($_REQUEST["js"]) {
header("Content-Type: image/gif\r\n");
header("Cache-Control: no-cache, must-revalidate\r\n");
header("Pragma: no-cache\r\n");
@readfile("visits.gif");
}
?>
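a) You have 6 elements in $spiders and only 5 in $agents. The loop runs up to count($spiders), so on the last pass $agents[5] does not exist, which produces the warning about offset 5 and the one about an empty regular expression. The Google entry is doubled: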
$spiders[] = "/Google/";
$spiders[] = "/Googlebot/";
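Remove one of these entries (for example "/Google/") so that both arrays have the same length and line up index for index.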
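One way to keep the two lists from drifting apart again is to store each pattern together with its label in a single array. The sketch below is only an illustration: detectSpider() and $bots are hypothetical names, not part of the original script. It also adds the "i" modifier to each pattern, because eregi matched case-insensitively and preg_match does not unless told to, so "/googlebot/" alone will not match "Googlebot" in a user agent string.

```php
<?php
// Hypothetical helper (not in the original script): returns the label of the
// first bot whose pattern matches the user agent, or null if none matches.
function detectSpider($userAgent, array $bots)
{
    foreach ($bots as $pattern => $label) {
        // preg_match() returns 1 on a match, 0 on no match, false on error.
        if (preg_match($pattern, $userAgent) === 1) {
            return $label;
        }
    }
    return null;
}

// Pattern => label pairs. The "i" modifier restores the case-insensitive
// matching that eregi() provided. "yahoo! slurp" is listed before the more
// general "slurp" so the specific label wins.
$bots = array(
    '/googlebot/i'    => 'Googlebot',
    '/yahoo! slurp/i' => 'Yahoo! Slurp',
    '/slurp/i'        => "Slurp (Inktomi's robot, HotBot)",
    '/msnbot/i'       => 'MSN Robot (MSN Search, search.msn.com)',
    '/bingbot/i'      => 'Bing.com',
);

// The header can be missing (e.g. on the command line), so guard with isset().
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$spider = detectSpider($userAgent, $bots);
$found = ($spider !== null);
```

With this layout the labels are plain strings and need no regex delimiters, and $spider can be used wherever the script currently uses $spiders[$spi].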
b) if ($_REQUEST["js"]) {
should be replaced with:
if (isset($_REQUEST["js"])) {
Depending on what value you expect there, check the value after the isset() test. Request parameters arrive as strings, so a strict comparison against true would never match; compare against the string you expect instead, for instance "1":
if (isset($_REQUEST["js"]) && $_REQUEST['js'] === '1') {
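As a rough sketch of how that check could sit in the script, the snippet below wraps it in a small function. isJsPing() is a hypothetical name, and the assumption that the tracking request sends js=1 is mine; adjust the compared value to whatever your page actually sends.

```php
<?php
// Hypothetical helper: true only when the "js" parameter is present and
// equals the string "1" (assumed value; request parameters are strings).
function isJsPing(array $request)
{
    return isset($request['js']) && $request['js'] === '1';
}

if (isJsPing($_REQUEST)) {
    // header() adds the line ending itself, so no trailing "\r\n" is needed.
    header('Content-Type: image/gif');
    header('Cache-Control: no-cache, must-revalidate');
    header('Pragma: no-cache');
    readfile('visits.gif');
}
```

Because isset() is evaluated first, a request without the parameter no longer triggers an undefined index notice.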