I want to split an HTML file, by the <section> tag, into separate files.
An example might be:
mypage.html
<!DOCTYPE html>
<html>
<head>
...
</head>
<body>
<!-- Section 1 -->
<section class="foo">
...
</section>
<!-- Section 2 -->
<section class="bar">
...
</section>
<!-- Section 3 -->
...
</body>
</html>
The desired outcome would then be the following files:
/mypage.html # (original file)
/mypage-split.html # (original file, with placeholders to replace the section back in)
# component/include files (that of course will not be valid HTML, since it's just a portion and won't start with `DOCTYPE` or `html`)
/sections/mypage-1.htmlinc # (section 1 markup)
/sections/mypage-2.inc # (section 2 markup)
...
/sections/mypage-n.html
How can I perform this split?
A shell script might be the easiest way, but my scripting skill is very limited.
Or, is there any web standard for keeping components of HTML pages in separate files (supported by browsers or web servers), without having to resort to a web programming language (server or client side)?
tag=section
# Print every line from an opening <section ...> tag through its closing tag.
sed -n "/<${tag}[ >]/,/<\/${tag}>/p" mypage.html > section.inc
This should be a starting point for you: you can set the target HTML tag name in the `tag` shell variable; `sed` will extract the file content delimited by your tag and put it into the file path `section.inc`.
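A `sed` address range prints every matching block to the same output, so it does not by itself produce one file per section or the placeholder copy asked for in the question. The following is a minimal sketch of one way to do both with `awk`, not a definitive implementation. It assumes sections are not nested and that each opening and closing tag can be recognised line by line. The function name `split_sections`, the `.inc` extension, and the `<!-- include ... -->` placeholder syntax are all hypothetical choices made for illustration.

```shell
# Sketch: write each <tag>...</tag> block to sections/<base>-N.inc and a copy
# of the page, with placeholder comments, to <base>-split.html.
# Assumes non-nested sections; output goes to the current directory.
split_sections() {
    src=$1                            # input page, e.g. mypage.html
    tag=${2:-section}                 # tag name to split on
    base=$(basename "$src" .html)     # e.g. mypage
    mkdir -p sections
    awk -v tag="$tag" -v base="$base" '
        BEGIN { page = base "-split.html" }
        # Opening tag, with or without attributes: start a new include file.
        !inside && $0 ~ ("<" tag "[ >]") {
            inside = 1
            out = "sections/" base "-" (++n) ".inc"
            # Hypothetical placeholder syntax; adapt to your include mechanism.
            print "<!-- include " out " -->" > page
        }
        # Inside a section: copy the line to the current include file.
        inside {
            print > out
            if ($0 ~ ("</" tag ">")) { inside = 0; close(out) }
            next
        }
        # Outside any section: copy the line to the split page unchanged.
        { print > page }
    ' "$src"
}
```

Calling `split_sections mypage.html` in the directory that holds the page should leave `mypage.html` untouched and create `mypage-split.html` plus `sections/mypage-1.inc`, `sections/mypage-2.inc`, and so on. Reassembling the page from the placeholders is left to whatever include mechanism you choose.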