Tags: php, .htaccess, bots, robots.txt

Can a robots.txt Disallow be bypassed with .htaccess?


I have a simple question. Let's say that I have this in robots.txt:

User-agent: *
Disallow: /

And something like this in .htaccess:

RewriteRule ^somepage/.*$ index.php?section=ubberpage&parameter=$0

And of course in index.php something like:

$imbaVar = $_GET['section'];
// Some splitting or other logic to pick a specific page

include("pages/theImbaPage.html"); // Or a .php file, or whatever

Will robots be able to see the contents of the HTML file included by the script (at site.com/somepage)? I mean, the URL points to a disallowed location (/somepage is disallowed), but the request is still rewritten internally to a valid one (index.php).


Solution

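  • Assuming the robots respect robots.txt, they won't request any page on the site at all (you stated that you used Disallow: /). A compliant robot checks the URL it is about to request against robots.txt; it never sees the internal rewrite to index.php, so the rewrite does not open a way around the rule.

    If, however, the robots do not respect your robots.txt file, they will be able to see the content, because the rewrite happens server-side and the page is served like any other.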
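
To illustrate the first point, here is a minimal sketch of the check a well-behaved crawler might run before fetching a URL. It uses Python's standard `urllib.robotparser` module. The `is_allowed` helper and the `ExampleBot` user agent are made up for this example, and the robots.txt content is parsed from a string rather than downloaded from site.com.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question, parsed in memory (no network access).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if robots.txt permits user_agent to fetch url.

    Hypothetical helper: a compliant crawler would run a check like this
    on the public URL before sending the request.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The check is made on the URL the crawler wants to request. The crawler
# knows nothing about the server-side RewriteRule, so it makes no difference.
print(is_allowed("http://site.com/somepage/foo"))                 # False
print(is_allowed("http://site.com/index.php?section=ubberpage"))  # False
```

Both URLs are refused because `Disallow: /` matches every path. Nothing on the server enforces this, though: a robot that skips the check simply sends the request and receives whatever index.php outputs.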