apache .htaccess http-redirect mod-alias

Strip parent categories from URL


I'm struggling to fix an issue with 301 redirects in .htaccess. I have moved a site from an old domain to a new domain, and I have successfully redirected it with a 301, like so:

Redirect 301 / https://newdomain.com

On the old site child category URLs are like this:

olddomain.com/product-category/parent-cat1/parent-cat2/child-cat

or

olddomain.com/product-category/parent-cat1/child-cat

or

olddomain.com/product-category/child-cat

Whereas on the new site they are:

newdomain.com/product-category/child-cat

Unfortunately, the redirected URLs result in 404s on the new site. Is there any way to remove the parent categories (which can vary in name and number) from the URL?


Solution

  • Try including the following RedirectMatch directive before your existing Redirect directive:

    RedirectMatch 302 ^/([\w-]+)/(?:[\w-]+/)+([\w-]+)$ https://newdomain.com/$1/$2
    

    The RedirectMatch directive complements the Redirect directive; both are part of mod_alias. The difference is that RedirectMatch uses a regex to match the URL-path, whereas Redirect uses simple prefix-matching.

    This assumes that the path segments (i.e. "product-category", "parent-cat" and "child-cat") consist of just the characters a-z, A-Z, 0-9, _ and - (hyphen). The pattern needs to be as specific as possible so as not to match "too much". At least one "parent-cat" segment is required, so a URL with no parent category does not match this pattern and falls through to your existing Redirect directive.

    $1 is a backreference to the first captured group in the pattern, i.e. the ([\w-]+) at the start, which matches the product-category. $2 is a backreference to the second captured group, i.e. the ([\w-]+) at the end of the pattern, which matches the child-cat. The (?:...) group in the middle is a non-capturing group, so no backreference applies to it.

    This is a 302 (temporary) redirect. Change it to a 301 only once you have confirmed it works. It is easier to test with 302s since, unlike 301s, they are not cached by the browser by default. Because your earlier 301 may already be cached, make sure your browser cache is clear before testing.
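
    Put together, the top of the .htaccess file might look something like the sketch below. It assumes both directives live in the same .htaccess file on the old domain; the Redirect line is copied unchanged from the question, and the comments trace the three example URLs from the question through the pattern.

    ```apache
    # More specific rule first: mod_alias processes Redirect and
    # RedirectMatch directives in the order they appear, and the first
    # match wins.
    #
    # /product-category/parent-cat1/parent-cat2/child-cat
    #     $1 = product-category, $2 = child-cat
    #     -> https://newdomain.com/product-category/child-cat
    # /product-category/parent-cat1/child-cat
    #     -> https://newdomain.com/product-category/child-cat
    # /product-category/child-cat
    #     no middle segment, so this rule does not match
    RedirectMatch 302 ^/([\w-]+)/(?:[\w-]+/)+([\w-]+)$ https://newdomain.com/$1/$2

    # Existing catch-all from the question, unchanged. It handles every
    # request the rule above did not match.
    Redirect 301 / https://newdomain.com
    ```

    Note that the pattern as written matches any URL with three or more path segments, not only those under /product-category. If that is too broad for your site, one option (an assumption about your URL layout, not something stated in the question) is to replace the first ([\w-]+) with (product-category) so that only category URLs are rewritten.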