wordpress, .htaccess, http-redirect, http-status-code-301

Regex 301 RedirectMatch in .htaccess not working


Some time ago my WordPress site was hacked and was redirecting all clicks from Microsoft and Google search results to some kind of Japanese auto parts store. This went on for a few weeks, during which Google Search Console recorded a lot of 404 URLs for my website. Many, if not all, of those 404 URLs contain some gibberish and end with digits. Here are a few examples:

/40epolymitosis1548027
/9c44d4sunspottedness1569986
/37awald1559460

I would like to add a 301 regex redirect to the homepage for all URLs that end with five or more digits. This is what I have in my .htaccess file:

<IfModule mod_rewrite.c>
RewriteEngine On
RedirectMatch 301 .*\d{5,} /
</IfModule>

But that is not working. By the way, I added the regex redirect using the Yoast plugin. In my mind, RedirectMatch 301 .*\d{5,} / should redirect all URLs that have 5 or more digits at the end, but it does not. Any thoughts or suggestions on why it's not working?


Solution

  • From an SEO perspective it makes no sense to 301 redirect URLs that were already "correctly" returning a 404 status to the homepage. Google will treat these redirects as soft-404s at best, and they could potentially cause additional problems. A 404 was already the correct response. You could perhaps return a "410 Gone" if these URLs did, at some point, return content with a "200 OK" response status.

    The only reason to do anything in .htaccess (and return a 404 "early") is to prevent the request being handled by WordPress, which might put unnecessary load on your system.

    Since WordPress is already using mod_rewrite (as part of the core WP code block to rewrite URLs to the front-controller), you need to also use mod_rewrite to handle these other URLs. RedirectMatch is a mod_alias directive and consequently is processed later, despite the apparent order of the directives in the config file.

    So, to force a 404 for any URL that ends in 5 or more digits, add the following at the very top of the .htaccess file, before the # BEGIN WordPress comment marker.

    # Force a 404 for all URLs that end in 5 digits or more
    RewriteRule \d{5,}$ - [R=404]
    

    You do not need the <IfModule> wrapper and you do not need to repeat the RewriteEngine On directive (since that already occurs later in the file as part of the default WordPress code block).

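    The L flag is not required here: when the R flag is given a status code outside the 3xx range, rule processing stops as if L had been used.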

    To return a "410 Gone" instead, change R=404 to G. (To return a "403 Forbidden", use the F flag instead.)
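
    Putting the pieces together, the top of the file might look roughly like the sketch below. The WordPress section shown is the commonly seen default for a site installed in the document root and is included only as an assumption for orientation. Your own block may contain additional lines and should be left exactly as WordPress wrote it.

    ```apache
    # Force a 404 for all URLs that end in 5 digits or more.
    # Alternatives: use [G] for "410 Gone" or [F] for "403 Forbidden".
    RewriteRule \d{5,}$ - [R=404]

    # (Assumed typical default block below; yours may differ.)
    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress
    ```

    Keeping the custom rule outside the # BEGIN WordPress / # END WordPress markers matters because WordPress may overwrite everything between those markers when it regenerates the block.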


    RedirectMatch 301 .*\d{5,} /
    

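    As mentioned above, the RedirectMatch directive belongs to mod_alias, not mod_rewrite, and mod_alias is processed later. In addition, since the regex is not anchored, .*\d{5,} matches any URL-path that simply contains at least 5 contiguous digits, not only those that end in digits. To match only at the end, the pattern needs a trailing $, as in \d{5,}$. The .* prefix on the regex is not required.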
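
    To illustrate the difference anchoring makes, here is a small sketch of the two patterns written as mod_rewrite rules. The path product-1234567/details is hypothetical and is not taken from the question.

    ```apache
    # Unanchored: matches when 5+ contiguous digits appear anywhere in the URL-path,
    # e.g. 40epolymitosis1548027, but also the hypothetical product-1234567/details.
    # (Commented out; shown for comparison only.)
    # RewriteRule \d{5,} - [R=404]

    # Anchored with $: matches only when the URL-path ends in 5+ digits,
    # e.g. 40epolymitosis1548027, but not product-1234567/details.
    RewriteRule \d{5,}$ - [R=404]
    ```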