Tags: sql-server, xml, cdata, for-xml-path, for-xml-explicit

Storing the text of a stored procedure in an XML data type in SQL Server


I need to store the text of all the stored procedures in a database in an XML data type. When I use FOR XML PATH, the text of the stored procedure contains serialized character references such as &#x0D; and &#x0A; for CRLF, &quot; for double quotes, etc. I need the text to be stored in the XML structure without these characters because it will be used to recreate the stored procedure.

This is the query that I use for FOR XML PATH:

SELECT 
    [View].name AS "@VName", [Module].definition AS "@VDefinition"
FROM 
    sys.views AS [View] 
INNER JOIN 
    sys.sql_modules AS [Module] ON [Module].object_id = [View].object_id
FOR XML PATH ('View'), TYPE

I read that I should use CDATA for the text, which requires FOR XML EXPLICIT. However, when I run the following query and view the XML data, the output also contains those characters. I need the text to be in plain text without these characters.

This is my query:

SELECT  
    1 AS Tag,
    0 AS Parent,
    NULL AS [Database1!1],      
    NULL AS [StoredProcedure!2!VName],
    NULL AS [StoredProcedure!2!cdata]

UNION ALL

SELECT  
    2 AS Tag,
    1 AS Parent,        
    NULL,
    [StoredProcedure].name as [StoredProcedure!2!!CDATA],
    [Module].definition as [StoredProcedure!2!!CDATA]
FROM 
    sys.procedures AS [StoredProcedure] 
INNER JOIN 
    sys.sql_modules [Module] ON [StoredProcedure].object_id = [Module].object_id
WHERE 
    [StoredProcedure].name NOT LIKE '%diagram%'
FOR XML EXPLICIT    

How can I store the text of the stored procedures as plain text? Or, when I parse the XML data type to recreate a stored procedure, can I deserialize the text so that it does not contain those characters?

Ideally, I would like to use FOR XML PATH but if that is not possible I will use FOR XML EXPLICIT.


Solution

  • If you want to store data with special characters within XML, there are two options (plus a joke option)

    The point is: You do not need this!

    The only reason for CDATA (at least for me) is manually created content (copy'n'paste or typing). Whenever you build your XML automatically, you should rely on the implicitly applied escaping.

    Why does it matter how the data looks inside the XML?

    If you read it properly (with the XML methods, not with SUBSTRING or other string-based methods), you will get it back in its original form.

    Try this:

    DECLARE @TextWithSpecialCharacters NVARCHAR(100)=N'€ This is' + CHAR(13) + 'strange <ups, angular brackets! > And Ampersand &&&';
    
    SELECT @TextWithSpecialCharacters FOR XML PATH('test');
    

    returns

    € This is
    strange &lt;ups, angular brackets! &gt; And Ampersand &amp;&amp;&amp;
    

    But this...

    SELECT (SELECT @TextWithSpecialCharacters FOR XML PATH('test'),TYPE).value('/test[1]','nvarchar(100)');
    

    ...returns

    € This is
    strange <ups, angular brackets! > And Ampersand &&&
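
    Applied to the stored procedures from the question, the same round trip could look like the sketch below. It is only an illustration under assumptions: the element names Database1 and StoredProcedure and the attribute VName are borrowed from the question, while the variable @Procs and the aliases p, m, A and sp are made up for this example.

```sql
-- Build the XML. TYPE returns a native xml value instead of a string.
DECLARE @Procs XML =
(
    SELECT p.name       AS [@VName],
           m.definition AS [text()]
    FROM sys.procedures AS p
    INNER JOIN sys.sql_modules AS m ON m.object_id = p.object_id
    WHERE p.name NOT LIKE '%diagram%'
    FOR XML PATH('StoredProcedure'), ROOT('Database1'), TYPE
);

-- Read it back with .nodes() and .value().
-- The escaped characters are decoded again on the way out.
SELECT sp.value('@VName', 'nvarchar(128)')       AS ProcName,
       sp.value('(text())[1]', 'nvarchar(max)')  AS ProcDefinition
FROM @Procs.nodes('/Database1/StoredProcedure') AS A(sp);
```

    The ProcDefinition column should contain the definition with its original line breaks, quotes and angle brackets, so it can be passed on as is to whatever recreates the procedure.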
    

    Microsoft decided not even to support CDATA with FOR XML (except EXPLICIT, which is a pain in the neck...)
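
    For completeness, here is a minimal sketch of the EXPLICIT route and of why it does not help once the result lands in an xml variable. The element name demo and the variables @Text and @AsXml are hypothetical names for this example only.

```sql
DECLARE @Text NVARCHAR(100) = N'a < b & c';

-- EXPLICIT mode with the cdata directive (no attribute name before it).
-- Without TYPE the result comes back as text, CDATA section included.
SELECT 1     AS Tag,
       NULL  AS Parent,
       @Text AS [demo!1!!cdata]
FOR XML EXPLICIT;

-- The same query with TYPE, stored in an xml variable.
-- The xml data type does not keep CDATA sections; the content
-- is held as ordinary (escaped) text instead.
DECLARE @AsXml XML =
(
    SELECT 1     AS Tag,
           NULL  AS Parent,
           @Text AS [demo!1!!cdata]
    FOR XML EXPLICIT, TYPE
);

SELECT @AsXml AS Converted;

-- Reading with .value() gives back the unescaped text either way.
SELECT @AsXml.value('(/demo/text())[1]', 'nvarchar(100)') AS ReadBack;
```

    So as soon as the target is the xml data type, escaping is what you end up with anyway, and .value() undoes it when reading.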

    Read two related answers (by me :-) about CDATA)