
Oracle XMLQuery inserting unwanted namespace


Oracle 11.2

Below is a cut-down version of an XMLQuery I'm running on an XMLType column. When I run the query, which simply parses and recreates the stored XML, the unwanted default and tsip namespaces get inserted into the child elements of the parent. Note that the tsxm namespace does not get inserted; this is because it is not equal to the default namespace. This query does nothing useful and could easily be rewritten, but the real (much bigger) query uses the same methodology, which is why I'm posting the question in this format.

Create the table:

CREATE TABLE XML_DOCUMENT_TMP
(
  DOCUMENT_ID   NUMBER(12)                      NOT NULL,
  XML_DATA      SYS.XMLTYPE                     NOT NULL,
  CREATED_DATE  TIMESTAMP(6)                    NOT NULL
);

Insert some data (which must have the namespaces as is):

insert into XML_DOCUMENT_TMP
(document_id,created_date,xml_data)
values(1,sysdate, 
'<patent  xmlns="http://schemas.thomson.com/ts/20041221/tsip" 
    xmlns:tsip="http://schemas.thomson.com/ts/20041221/tsip" 
    xmlns:tsxm="http://schemas.thomson.com/ts/20041221/tsxm"  
    tsip:action="replace" tsip:cc="CA" tsip:se="2715340" tsip:ki="C">
    <accessions tsip:action="replace">
        <accession tsip:src="wila" tsip:type="key">CA-2715340-C</accession>
        <accession tsip:src="tscm" tsip:type="tscmKey">CA-2715340-C-20150804</accession>
    </accessions>
    <claimed tsip:action="replace">
        <claimsTsxm tsip:lang="en">
            <tsxm:heading tsxm:align="left">We Claim:</tsxm:heading>
            <claimTsxm tsip:no="1" tsxm:num="1" tsip:type="main">1.  power.       </claimTsxm>
      </claimsTsxm>
  </claimed>
</patent>
');

Run the XMLQuery:

Note the need for namespace wildcarding is explained here

WITH tmpTable AS (
SELECT * FROM XML_DOCUMENT_TMP cm )
SELECT tt.xml_data ,
XMLQuery('declare default element namespace  "http://schemas.thomson.com/ts/20041221/tsip";
  declare namespace  tsip="http://schemas.thomson.com/ts/20041221/tsip";
  declare namespace  tsxm="http://schemas.thomson.com/ts/20041221/tsxm"; 


  return          
  <patent>{$m/*:patent/@*}
  {
    for $i in $m/*:patent/*
        return    $i
  }
  </patent>' 
        PASSING tt.xml_data as "m"   RETURNING CONTENT) newXml 
 FROM tmpTable tt
 WHERE tt.document_id in (1);

Returns:

<patent xmlns="http://schemas.thomson.com/ts/20041221/tsip" xmlns:tsip="http://schemas.thomson.com/ts/20041221/tsip" tsip:action="replace" tsip:cc="CA" tsip:se="2715340" tsip:ki="C">
    <accessions xmlns="http://schemas.thomson.com/ts/20041221/tsip" xmlns:tsip="http://schemas.thomson.com/ts/20041221/tsip" tsip:action="replace">
        <accession tsip:src="wila" tsip:type="key">CA-2715340-C</accession>
        <accession tsip:src="tscm" tsip:type="tscmKey">CA-2715340-C-20150804</accession>
    </accessions>
    <claimed xmlns="http://schemas.thomson.com/ts/20041221/tsip" xmlns:tsip="http://schemas.thomson.com/ts/20041221/tsip" tsip:action="replace">
        <claimsTsxm tsip:lang="en">
            <tsxm:heading xmlns:tsxm="http://schemas.thomson.com/ts/20041221/tsip" tsxm:align="left">We Claim:</tsxm:heading>
            <claimTsxm tsip:no="1" xmlns:tsxm="http://schemas.thomson.com/ts/20041221/tsip" tsxm:num="1" tsip:type="main">1.  power.</claimTsxm>
        </claimsTsxm>
    </claimed>
</patent>

How do I get rid of the unwanted namespaces created in the accessions and claimed elements? Any suggestions appreciated.


Solution

  • If you experiment with different namespace values, you can see that at the top <patent> level the namespaces are declared and included because of the declarations you make in the XQuery, but at the child-element level those declarations aren't used in the way that you are expecting.

    The XQuery works out the namespaces from those in use in the nodes being processed in that iteration of the loop, independently of the document as a whole. This is why they get "re-declared" each time the XQuery goes round the loop.

    Other articles explain that what you're trying to do is to "parse" the data as well as "extract" it, which is true to an extent, and that XSLT is therefore a better fit than XQuery.

    One external link I found, which shows an XQuery way of stripping the namespaces and so returning the "raw" XML, is here.

    Applying that code to your XQuery has got me to:

    SELECT xmlquery('xquery version "1.0"; (: :)
                 declare default element namespace 
                            "http://www.somewherein.uk/ns/1.0"; (: :)
    
                 declare function local:strip-namespace($inputRequest  as element()) as element()
                 {
                    element {xs:QName(local-name($inputRequest ))}
                    {
                      for $child in $inputRequest /(@*,node())
                        return
                          if ($child instance of element())
                          then local:strip-namespace($child)
                          else $child
                    }
                 }; (: :)
    
                 <patent>
                 {
                 for $s in /*:patent/*
                  return local:strip-namespace($s)
                 }
                 </patent>' 
                 PASSING cmf.XML_DATA 
                 RETURNING content)
    FROM XML_DOCUMENT_TMP cmf WHERE cmf.DOCUMENT_ID=1;
    

    Some further editing got me to the query below, which I think is what you were after (namespaces defined at the patent level):

    SELECT xmlquery('xquery version "1.0"; (: :)
                 declare default element namespace 
                            "http://www.somewherein.uk/ns/1.0"; (: :)
    
                 declare function local:strip-namespace($inputRequest as element()) as element()
                 {
                    element {fn:name($inputRequest)}
                    {
                      for $child in $inputRequest /(@*,node())
                        return
                          if ($child instance of element())
                          then local:strip-namespace($child)
                          else $child
                    }
                 }; (: :)
    
                 <patent>
                 {
                 for $s in /(*:patent, node())
                  return local:strip-namespace($s)
                 }
                 </patent>' 
                 PASSING cmf.XML_DATA 
                 RETURNING content)
    FROM XML_DOCUMENT_TMP cmf WHERE cmf.DOCUMENT_ID=1;
    

    As commented below, that led to some duplication in the loop code, due to some issues in the XPath. It also meant that the tsxm namespace was declared a couple of times: the XQuery declares it "the first time" it encounters the namespace being used as it walks that tree branch, so if there are siblings that use the namespace it will be declared multiple times. By moving the explicit placement of the declaration back to the parent node we can eliminate that.

    SELECT xmlquery('xquery version "1.0"; (: :)
                 declare default element namespace  "http://schemas.thomson.com/ts/20041221/tsip"; (: :)
                 declare namespace  tsip="http://schemas.thomson.com/ts/20041221/tsip"; (: :)
                 declare namespace  tsxm="http://schemas.thomson.com/ts/20041221/tsxm"; (: :)
    
                 declare function local:strip-namespace($inputRequest as element()) as element()
                 {
                    element {fn:name($inputRequest)}
                    {
                      for $child in $inputRequest /(@*,node())
                        return
                          if ($child instance of element())
                          then local:strip-namespace($child)
                          else $child
                    }
                 }; (: :)
    
                 <patent xmlns:tsxm="http://schemas.thomson.com/ts/20041221/tsxm" xmlns:tsip="http://schemas.thomson.com/ts/20041221/tsip">
                 {
                 for $s in /*:patent/*
                  return local:strip-namespace($s)
                 }
                 </patent>' 
                 PASSING cmf.XML_DATA
                 RETURNING content)
    FROM XML_DOCUMENT_TMP cmf WHERE cmf.DOCUMENT_ID=1;
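
    For comparison, the XSLT route mentioned above could look roughly like the sketch below. It applies a standard XSLT 1.0 identity transform through Oracle's XMLTransform function, copying every node and attribute as-is. This has not been verified on 11.2, and whether the serialized result is free of repeated namespace declarations is an assumption you would need to check against your own data. The column alias new_xml is made up for illustration.

    -- Sketch only: identity transform via XMLTransform (untested on 11.2).
    -- The stylesheet copies each attribute and node unchanged; templates
    -- for the real restructuring would be added alongside this one.
    SELECT XMLTransform(
             cmf.XML_DATA,
             XMLType('<xsl:stylesheet version="1.0"
                          xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
                        <xsl:template match="@*|node()">
                          <xsl:copy>
                            <xsl:apply-templates select="@*|node()"/>
                          </xsl:copy>
                        </xsl:template>
                      </xsl:stylesheet>')
           ) AS new_xml  -- hypothetical alias
    FROM XML_DOCUMENT_TMP cmf WHERE cmf.DOCUMENT_ID=1;

    Because xsl:copy works on the whole tree in one pass rather than rebuilding each child in a separate loop iteration, it may avoid the per-element re-declaration seen with the XQuery approach, but treat that as something to confirm rather than a guarantee.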