excelxmlutf-8xslt-1.0special-characters

Issue creating a LF output from a CRLF input in an xml file for Excel using xslt


I have some XML that I want to transform into XML that Excel will open in a table.
But, some of my source data uses the string 
 to denote the CRLF and create a new line.
I understand that Excel only responds to the string 
 which is a LF only, to create a new line within a cell.
I have edited the result file to insert the string 
 and I get the desired effect in Excel, so I know its possible to do this.

I thought it would be simple to use a function like translate to swap the string [CRLF] to [LF]. But, I have had no success so far.

I'm limited to XSLT 1.0

Using some simple sample .xml borrowed from W3:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <cd>
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <country>USA</country>
        <company>Columbia</company>
        <price>10.90</price>
        <year>1985</year>
        <comment>Text on one line</comment>
    </cd>
    <cd>
        <title>Hide your heart</title>
        <artist>Bonnie Tyler</artist>
        <country>UK</country>
        <company>CBS Records</company>
        <price>9.90</price>
        <year>1988</year>
    </cd>
    <cd>
        <title>Greatest Hits</title>
        <artist>Dolly Parton</artist>
        <country>USA</country>
        <company>RCA</company>
        <price>9.90</price>
        <year>1982</year>
    </cd>
    <cd>
        <title>Still got the blues</title>
        <artist>Gary Moore</artist>
        <country>UK</country>
        <company>Virgin records</company>
        <price>10.20</price>
        <year>1990</year>
        <comment>Text with a break&#13; here (crlf)</comment>
    </cd>

We can see that within a <comment> the string &#13; is used.
In the .xsl itself I've used this:

<xsl:for-each select="catalog/cd">
    <!-- TABLE CONTENTS -->
    <Row>
        <!-- Title -->
        <Cell>
            <Data ss:Type="String">
                <xsl:value-of select="title"/>
            </Data>
        </Cell>
        <!-- Artist -->
        <Cell>
            <Data ss:Type="String">
                <xsl:value-of select="artist"/>
            </Data>
        </Cell>
        ...
        <!-- Comment -->
        <Cell>
            <Data ss:Type="String">
                <xsl:value-of disable-output-escaping="yes" select="translate(comment,'&#13;','&#10;')"/>
            </Data>
        </Cell>
    </Row>
</xsl:for-each>

The header of my XSL looks like this:

?xml version="1.0" encoding = "UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/">
        <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">

Currently the result is like this: transformed xml viewed in Notepad++ transformed xml viewed in Excel

EDIT Settings in Excel: Wrap text enabled

Where a [CRLF] symbol has been used rather than the string &#10; which would work in Excel and display the cell with the break inside it. If I edit the result to insert the text string &#10; Excel will break the text within the cell in the desired way:

transformed xml in Notepad++ viewed in Excel

I'm stuck with XSLT 1.0, am I missing something simple here?

Many thanks in advance.

edit OS is WIN, and I'm using the XML tools plugin in Notepad++ to perform the transform while testing.

I just used some webpage to test another engine: webpagetransform I can see there is a break within the field, but its not the test string &#10; and I don't believe that Excel will use it to start a new line within a cell


Solution

  • What you want is the entry for the 1st cell and not the 2nd?

    <root>
       <cell>this is&#10;a line feed</cell>
       <cell>this is
    a line feed</cell>
    </root>
    

    it may be helpful to look at this

    if your data looks like this

    <data>
        <item>
            <test>A&#13;B</test>
        </item>
    </data>
    

    this stylesheet produces the output below

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    
        <xsl:variable name="lineFeed">
            <xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
        </xsl:variable>
        <xsl:template match="/">
            <Root>
                <Cell>
                    <xsl:value-of select="substring-before(data/item/test, '&#13;')"/>
                    <xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
                    <xsl:value-of select="substring-after(data/item/test, '&#13;')"/>
                </Cell>
                <Cell>
                    <xsl:value-of select="translate(data/item/test,'&#13;',$lineFeed)"/>
                </Cell>
                <Cell>
                    <xsl:call-template name="convertToLineFeed">
                        <xsl:with-param name="string" select="data/item/test"/>
                    </xsl:call-template>
                </Cell>
            </Root>
        </xsl:template>
    
        <xsl:template name="convertToLineFeed">
            <xsl:param name="string"/>
            <xsl:variable name="before" select="substring-before($string, '&#13;')"/>
            <xsl:choose>
                <xsl:when test="$before">
                    <xsl:value-of select="$before"/>
                    <xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
                    <xsl:call-template name="convertToLineFeed">
                        <xsl:with-param name="string" select="substring-after($string, '&#13;')"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="$string"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:template>
    </xsl:stylesheet>
    

    which outputs this

    <Root>
      <Cell>A&#10;B</Cell>
      <Cell>A&amp;B</Cell>
      <Cell>A&#10;B</Cell>
    </Root>
    

    i.e. note that the 1st line does do what you want the 2nd one doesnt

    I don't understand Michael Kay's answer, not because its wrong but because I don't have insight into the workings of the XSLT engine, so the answer is too technical for me.

    In my simplistic terms, the 1st line works because its writing the line feed directly to the output, so it sidesteps some of this "normalisation" stuff, the 2nd line the linefeed is encoded in a raw text fragment and gets normalised before its written to the output (something like that).

    So I think you CAN do what you want by writing a recursive template to do what the 1st cell example does, just recursively (I think), I'll do it now.


    I've now added the recursive solution as line 3.