I have more than 1,000 individual files in the format https://example.com/archives/<NNNNNN>.php where N is an integer.
I need to keep the existing structure as-is, though I can add to it. In this case, each page has a unique data-url-title="<url-friendly-title>" attribute.
I would like my .htaccess config to determine what file was read, use regex to extract the url-friendly-title, and rewrite the final URL to replace <NNNNNN>, in the format https://example.com/archives/url-friendly-title.php (I would then strip .php from it, but I didn't get that far).
# Enable rewriting
RewriteEngine On
# Match the old archive URL
RewriteCond %{REQUEST_URI} ^/archives/([0-9]+)\.php$
# Check that the file exists
RewriteCond expr "-f '%{DOCUMENT_ROOT}/archives/%1.php'"
# Extract data-url-title from the file content
RewriteCond expr "file('%{DOCUMENT_ROOT}/archives/%1.php') =~ /data-url-title=\"([^\"]+)\"/"
# Rewrite the URL to the new friendly URL
RewriteRule ^archives/([0-9]+)\.php$ /archives/%1.php [R=301,L]
Any attempt to use this configuration causes my entire website to throw an internal server error with the log message RewriteCond: bad flag delimiters.
I don't think I have any spaces in my regex.
I've converted my .htaccess file to Unix line endings.
I have looked at numerous answered questions, and they seem to indicate that, in Apache 2.4.x, it's possible to read a variable from an external file by using RewriteCond expr "<expression>".
I can't get this to work. What's more:
htaccess tester throws Unsupported TestString: expr for both RewriteCond expr statements.
.htaccess check specifically claims RewriteCond: bad flag delimiters for the second RewriteCond line.
I've tried appending the s flag to the regex, as in RewriteCond expr "file('%{DOCUMENT_ROOT}/archives/%1.php') =~ /data-url-title=\"([^\"]+)\"/s", for newline matching, but the result is the same: an internal server error in the browser and RewriteCond: bad flag delimiters in the logs.
Because rewriting the page URL based on a value inside the page, using .htaccess alone, seems unworkable, here's what I did instead:
First, I check for the required pattern and pass it to the redirect.php handler script. The trick is that I also pass the rewritten URL to the same handler script, so that anyone using the old-style URL still gets a valid result.
# Enable rewriting
RewriteEngine On
# Redirect numeric IDs to friendly URLs
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^archives/(\d+)\.php$ /archives/redirect.php?id=$1 [R=301,L,QSA]
# Handle friendly URLs
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^archives/([a-z0-9-]+)/?$ /archives/redirect.php?title=$1 [L,QSA]
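To sanity-check which of the two rules a given request path would hit, the patterns can be exercised outside Apache. This is only a minimal sketch, assuming PHP 7.0 or later: `classifyArchivePath` is a hypothetical helper that is not part of the handler, and it mirrors only the two RewriteRule patterns (it does not emulate the -f and -d conditions). In .htaccess context the pattern is matched against the path without its leading slash, so the sample paths are written that way.

```php
<?php
// Hypothetical helper: mirrors the two RewriteRule patterns above so they
// can be tried without Apache. It does NOT emulate the -f / -d conditions.
function classifyArchivePath(string $path): array {
    // Rule 1: numeric legacy URL, e.g. archives/000001.php
    if (preg_match('#^archives/(\d+)\.php$#', $path, $m)) {
        return ['id', $m[1]];
    }
    // Rule 2: friendly URL, e.g. archives/friendly-name (optional trailing slash)
    if (preg_match('#^archives/([a-z0-9-]+)/?$#', $path, $m)) {
        return ['title', $m[1]];
    }
    // Anything else is left for other rules (or a 404)
    return ['none', ''];
}

// Sample paths (assumed for illustration)
foreach (['archives/000001.php', 'archives/friendly-name', 'archives/Friendly Name'] as $path) {
    list($kind, $value) = classifyArchivePath($path);
    echo $path, ' => ', $kind, ' ', $value, PHP_EOL;
}
```

Note that the second pattern allows only lowercase letters, digits, and hyphens, so a path containing a dot (such as the old `.php` URLs) can never be mistaken for a friendly title.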
Here's where the fun begins. The comments in the code explain how the script works, but the point is that now:
https://example.com/archives/friendly-name works correctly
https://example.com/archives/000001.php still redirects to example.com/archives/friendly-name, for backwards compatibility
<?php
// Iterate through all available `<NNNNNN>.php` files and check
// whether a file contains the `data-url-title` attribute.
// Returns array($file, $title) on success, or false if nothing matches.
function findMatchingFile($fileFormat, $isId = false) {
    $files = glob(__DIR__ . "/*.php");
    foreach ($files as $file) {
        // Never treat the handler itself as an archive page
        if (basename($file) === 'redirect.php') continue;
        $content = file_get_contents($file);
        if ($isId) {
            // Searching by ID: the file name must match, then read its title
            if (basename($file) === $fileFormat . '.php') {
                if (preg_match('/data-url-title="([a-z0-9-]+)"/', $content, $matches)) {
                    return array($file, $matches[1]);
                }
            }
        } else {
            // Searching by title: the file must declare exactly this title
            if (preg_match('/data-url-title="' . preg_quote($fileFormat, '/') . '"/', $content)) {
                return array($file, $fileFormat);
            }
        }
    }
    return false;
}
// Get input from query parameters
$input = isset($_GET['id']) ? $_GET['id'] : (isset($_GET['title']) ? $_GET['title'] : '');
// Sanitize input
$input = trim(preg_replace('/[^a-z0-9-]/', '', strtolower($input)), '-');
// Find file matching input: if searching by ID and a matching file is found,
// redirect to the friendly archive URL; if searching by title, include the matching file
if (!empty($input)) {
    $result = findMatchingFile($input, isset($_GET['id']));
    if ($result) {
        if (isset($_GET['id'])) {
            header("HTTP/1.1 301 Moved Permanently");
            header("Location: /archives/" . urlencode($result[1]));
            exit();
        } else {
            include($result[0]);
            exit();
        }
    }
}
header("HTTP/1.0 404 Not Found");
echo "404 Not Found";
?>
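To show what the sanitizing step and the title pattern do to concrete input, here is a minimal sketch, assuming PHP 7.1 or later. `sanitizeInput` and `extractUrlTitle` are hypothetical wrappers around the two expressions used in the handler, and the sample markup is an assumption; the real pages may place the attribute on a different element.

```php
<?php
// Hypothetical wrappers around two expressions from the handler,
// pulled out so their behavior can be checked in isolation.

// Same expression the handler uses to sanitize the `id` / `title` parameter:
// lowercase, drop everything except a-z, 0-9 and '-', trim outer hyphens.
function sanitizeInput(string $input): string {
    return trim(preg_replace('/[^a-z0-9-]/', '', strtolower($input)), '-');
}

// Same pattern the handler uses to read the title out of a page's markup.
// Returns null when the attribute is missing or starts with a character
// outside a-z, 0-9 and '-'.
function extractUrlTitle(string $content): ?string {
    if (preg_match('/data-url-title="([a-z0-9-]+)"/', $content, $matches)) {
        return $matches[1];
    }
    return null;
}

// Assumed sample markup, for illustration only
$sample = '<article data-url-title="friendly-name">...</article>';

var_dump(sanitizeInput('-Friendly-Name!-')); // lowercased, '!' and outer hyphens removed
var_dump(sanitizeInput('../secret'));        // dots and slash removed
var_dump(extractUrlTitle($sample));          // the captured title
var_dump(extractUrlTitle('<p>no attribute here</p>')); // NULL
```

Because the sanitizer strips dots and slashes, a crafted `title` or `id` value cannot point outside the archives directory; the trade-off is that titles with uppercase letters or other characters in the markup will never match.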