sql-serverxmlt-sql

Include control characters in my XML without affecting other data


My nvarchar value contains a control character - Start of Text (0x0002 / STX). I would like to include this in my XML value. Now I know XML is not tolerant of control characters so the only way I can come up with is to include it as  (normally it should be  but XML does not like this). A side effect of doing this the way I am doing it is that normal characters get double-escaped, e.g. & becomes &.

Is there an easy way to include this character, and any other possible control characters in my XML value without having normal characters double-escaped? It doesn't have to even be as , I'd be just as happy with 0x0002, for example.

DECLARE @Data table (String nvarchar(max));
 
INSERT INTO @Data (String)
VALUES ('& hi');

--I'd like this to output <Test>&amp;#x02;&amp hi</Test> or similar
SELECT CAST ((SELECT (SELECT String AS [*] FOR XML PATH('')) 
              FROM @Data 
              FOR XML PATH('Test')) AS XML)

Solution

  • While SQL Server can generate XML v1.1 conformant XML very easily:

    SELECT '&' + CHAR(2) + '& hi' AS Test
    FOR XML PATH('');
    

    you can't store that in the xml type, which only allows XML v1.0 characters. By the way, instead of doing CAST, you just add , TYPE to the FOR XML to get it in the xml type.

    So you would need to just use REPLACE to replace all disallowed characters.

    DECLARE @Data table (String nvarchar(max));
     
    INSERT INTO @Data (String)
    VALUES ('&' + CHAR(2) + '& hi');
    
    SELECT
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(
      REPLACE(String,
        CHAR(0), '0x0000'),
        CHAR(1), '0x0001'),
        CHAR(2), '0x0002'),
        CHAR(3), '0x0003'),
        CHAR(4), '0x0004'),
        CHAR(5), '0x0005'),
        CHAR(6), '0x0006'),
        CHAR(7), '0x0007'),
        CHAR(8), '0x0008'),
        CHAR(11), '0x000B'),
        CHAR(12), '0x000C'),
        CHAR(14), '0x000E'),
        CHAR(15), '0x000F'),
        CHAR(16), '0x0010'),
        CHAR(17), '0x0011'),
        CHAR(18), '0x0012'),
        CHAR(19), '0x0013'),
        CHAR(20), '0x0014'),
        CHAR(21), '0x0015'),
        CHAR(22), '0x0016'),
        CHAR(23), '0x0017'),
        CHAR(24), '0x0018'),
        CHAR(25), '0x0019'),
        CHAR(26), '0x001A'),
        CHAR(27), '0x001B'),
        CHAR(28), '0x001C'),
        CHAR(29), '0x001D'),
        CHAR(30), '0x001E'),
        CHAR(31), '0x001F')
     AS Test
    FROM @Data
    FOR XML PATH(''), TYPE;
    

    db<>fiddle