I want to extract around 20 element types from some SVG documents to form a new SVG.
The whitelist is basically the set of visual elements: rect, circle, polygon, text, polyline and so on.
JavaScript, comments, animations and external links need to go.
Three methods come to mind:
If XSLT is the right tool for the job, what xsl:stylesheet do I need? Otherwise, which approach would you use?
Example input:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://creativecommons.org/ns#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:svg="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.w3.org/2000/svg" version="1.1" width="512" height="512" id="svg2">
<title>Mostly harmless</title>
<metadata id="metadata7">Some metadata</metadata>
<script type="text/ecmascript">
<![CDATA[
alert('Hax!');
]]>
</script>
<style type="text/css">
<![CDATA[ svg{display:none} ]]>
</style>
<defs id="defs4">
<circle id="my_circle" cx="100" cy="50" r="40" fill="red"/>
</defs>
<g id="layer1">
<a xlink:href="www.hax.ru">
<use xlink:href="#my_circle" x="20" y="20"/>
<use xlink:href="#my_circle" x="100" y="50"/>
</a>
</g>
<text>
<tspan>It was the best of times</tspan>
<tspan dx="-140" dy="15">It was the worst of times.</tspan>
</text>
</svg>
Example output, which displays exactly the same image:
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="512" height="512">
<defs>
<circle id="my_circle" cx="100" cy="50" r="40" fill="red"/>
</defs>
<g id="layer1">
<use xlink:href="#my_circle" x="20" y="20"/>
<use xlink:href="#my_circle" x="100" y="50"/>
</g>
<text>
<tspan>It was the best of times</tspan>
<tspan dx="-140" dy="15">It was the worst of times.</tspan>
</text>
</svg>
The approximate list of keeper elements is: g, rect, circle, ellipse, line, polyline, polygon, path, text, tspan, tref, textPath, linearGradient + stop, radialGradient, defs, clipPath.
If not specifically SVG Tiny, then certainly SVG lite.
Dimitre Novatchev's solution is cleaner and more elegant, but if you need a "whitelist" solution (because you can't predict what content users may input, and so can't enumerate everything to "blacklist"), then you need to fully flesh out the "whitelist". Note that element names in the match patterns are case-sensitive.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:svg="http://www.w3.org/2000/svg">
<xsl:output indent="yes" />
<!--The "whitelist" template, which copies matched nodes forward together with all of
their attributes, then applies templates to any child nodes -->
<xsl:template match="svg:svg
| svg:defs | svg:defs/text()
| svg:g | svg:g/text()
| svg:use | svg:use/text()
| svg:rect | svg:rect/text()
| svg:circle | svg:circle/text()
| svg:ellipse | svg:ellipse/text()
| svg:line | svg:line/text()
| svg:polyline | svg:polyline/text()
| svg:polygon | svg:polygon/text()
| svg:path | svg:path/text()
| svg:text | svg:text/text()
| svg:tspan | svg:tspan/text()
| svg:tref | svg:tref/text()
| svg:textPath | svg:textPath/text()
| svg:linearGradient | svg:linearGradient/text()
| svg:radialGradient | svg:radialGradient/text()
| svg:stop | svg:stop/text()
| svg:clipPath | svg:clipPath/text()">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
<!--The "blacklist" template, which does nothing except apply templates for the
matched node's attributes and child nodes -->
<xsl:template match="@* | node()">
<xsl:apply-templates select="@* | node()" />
</xsl:template>
</xsl:stylesheet>
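One caveat with the stylesheet above: xsl:copy-of select="@*" copies every attribute of a whitelisted element, so event-handler attributes (onclick, onload) and external xlink:href values would survive. Below is a minimal sketch of one way to filter attributes as well. It is not a complete sanitizer. The element list is abbreviated for brevity, and the two rules (drop attributes whose local name starts with "on", keep xlink:href only when it starts with "#") are my assumptions about what counts as unsafe here:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <xsl:output indent="yes"/>

  <!-- Abbreviated element whitelist (extend it with the full list as needed).
       Attributes go through templates here instead of being copied wholesale. -->
  <xsl:template match="svg:svg | svg:defs | svg:g | svg:use | svg:circle
                       | svg:text | svg:text/text()
                       | svg:tspan | svg:tspan/text()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep ordinary attributes -->
  <xsl:template match="@*" priority="1">
    <xsl:copy/>
  </xsl:template>

  <!-- Assumption: any attribute whose local name starts with "on" is an event handler -->
  <xsl:template match="@*[starts-with(local-name(), 'on')]" priority="2"/>

  <!-- Assumption: only same-document references such as #my_circle are safe -->
  <xsl:template match="@xlink:href[not(starts-with(., '#'))]" priority="2"/>

  <!-- Everything else: drop the node itself but keep looking inside it -->
  <xsl:template match="node()">
    <xsl:apply-templates select="node()"/>
  </xsl:template>
</xsl:stylesheet>
```

The explicit priority attributes settle which template wins when an attribute matches more than one pattern. Other vectors, such as url(...) references inside style attributes, are not handled by this sketch.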